Human regulatory molecules

ABSTRACT

The invention provides human regulatory molecules and the polynucleotides which identify and encode them. The invention also provides expression vectors, host cells, agonists, antibodies and antagonists. The invention further provides methods for diagnosing and treating disorders associated with expression of human regulatory molecules.

[0001] This application is a divisional of U.S. Pat. No. 09/518,865, filed 3 Mar. 2000, which was a divisional of U.S. Pat. No. 6,132,973, issued 17 Oct. 2000, which was a divisional of U.S. Pat. No. 5,932,442, issued 3 Aug. 1999.

FIELD OF THE INVENTION

[0002] This invention relates to nucleic acid and amino acid sequences of human regulatory molecules which are implicated in disease and to the use of these sequences in the diagnosis and treatment of diseases associated with cell proliferation.

BACKGROUND OF THE INVENTION

[0003] Cells grow and differentiate, carry out their structural or metabolic roles, participate in organismal development, and respond to their environment by altering their gene expression. Cellular functions are controlled by the timing and amount of expression attributable to thousands of individual genes. The regulation of expression is metabolically vital in that it conserves energy and prevents the synthesis and accumulation of intermediates such as RNA and incomplete or inactive proteins when the gene product is not needed.

[0004] Regulatory protein molecules function to control gene expression. These molecules turn individual genes or groups of genes on and off in response to various inductive mechanisms of the cell or organism; act as transcription factors by determining whether or not transcription is initiated, enhanced, or repressed; and splice transcripts as dictated in a particular cell or tissue. Although regulatory molecules interact with short stretches of DNA scattered throughout the entire genome, most gene expression is regulated near the site at which transcription starts or within the open reading frame of the gene being expressed. The regulated stretches of the DNA can be simple and interact with only a single protein, or they can require several proteins acting as part of a complex in order to regulate gene expression.

[0005] The double helix structure and repeated sequences of DNA create external features which can be recognized by the regulatory molecules. These external features are hydrogen bond donor and acceptor groups, hydrophobic patches, major and minor grooves, and regular, repeated stretches of sequence which cause distinct bends in the helix. Such features provide recognition sites for the binding of regulatory proteins. Typically, these recognition sites are less than 20 nucleotides in length although multiple sites may be adjacent to each other, and each may exert control over a single gene. Hundreds of these DNA sequences have been identified, and each is recognized by a different protein or complex of proteins which carry out gene regulation.

[0006] The regulatory protein molecules or complexes recognize and bind to specific nucleotide sequences of upstream (5′) nontranslated regions, which precede the first translated exon of the open reading frame (ORF); of intron junctions, which occur between the many exons of the ORF; and of downstream (3′) untranslated regions, which follow the ORF. The regulatory molecule surface features are extensively complementary to the surface features of the double helix. Even though each individual contact between the protein(s) and helix may be relatively weak (hydrogen bonds, ionic bonds, and/or hydrophobic interactions), the 20 or more contacts occurring between the protein and DNA result in a highly specific and very strong interaction.

Families of regulatory molecules

[0007] Many of the regulatory molecules incorporate one of a set of DNA-binding structural motifs, each of which contains either α helices or β sheets and binds to the major groove of DNA. Seven of the structural motifs common to regulatory molecules are helix-turn-helix, homeodomains, zinc finger, steroid receptor, β sheets, leucine zipper, and helix-loop-helix.

[0008] The helix-turn-helix motif is constructed from two α helices connected by a short chain of amino acids, which constitutes the “turn”. The two helices interact with each other to form a fixed angle. The more carboxy-terminal helix is called the recognition helix because it fits into the major groove of the DNA. The amino acid side chains of the helix recognize the specific DNA sequence to which the protein binds. The remaining structure varies a great deal among the regulatory proteins incorporating this motif. The helix-turn-helix configuration is not stable without the rest of the protein and will not bind to DNA without other peptide regions providing stability. Other peptide regions also interact with the DNA, increasing the number of unique sequences a helix-turn-helix can recognize.

[0009] Many sequence-specific DNA binding proteins actually bind as symmetric dimers to DNA sequences that are composed of two very similar half-sites, also arranged symmetrically. This configuration allows each protein monomer to interact in the same way with the DNA recognition site and doubles the number of contacts with the DNA. This doubling of contacts greatly increases the binding affinity while only doubling the free energy of the interaction. Helix-turn-helix motifs always bind to DNA that is in the B-DNA form.
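The statement that doubling the contacts only doubles the free energy yet greatly increases affinity can be illustrated numerically. The following is a minimal sketch that assumes the standard thermodynamic relation K = exp(-ΔG/RT), which the text does not state; the monomer binding energy of -30 kJ/mol, the temperature of 310 K, and the function name are illustrative assumptions, not values from this disclosure.

```python
import math

R = 8.314  # gas constant, J/(mol*K)

def association_constant(delta_g_j_per_mol, temp_k=310.0):
    """Equilibrium association constant K = exp(-dG / RT)."""
    return math.exp(-delta_g_j_per_mol / (R * temp_k))

# Assumed (hypothetical) binding free energy for one monomer on one half-site.
dg_monomer = -30_000.0        # J/mol
dg_dimer = 2 * dg_monomer     # doubling the contacts doubles the free energy

k_monomer = association_constant(dg_monomer)
k_dimer = association_constant(dg_dimer)

print(f"monomer K = {k_monomer:.3e}")
print(f"dimer   K = {k_dimer:.3e}")
```

Because the constant depends exponentially on the free energy, doubling the free energy squares the association constant rather than doubling it.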

[0010] The homeodomain motif is found in a special group of helix-turn-helix proteins that are encoded by homeotic selector genes, so called because the proteins encoded by these genes control developmental switches. For example, mutations in these genes cause one body part to be converted into another in the fruit fly, Drosophila. These genes have been found in every eukaryotic organism studied. The helix-turn-helix region of different homeodomains is always surrounded by the same structure, but not necessarily the same sequence, and the motif is always presented to DNA the same way. This helix-turn-helix configuration is stable by itself and, when isolated, can still bind to DNA. It may be significant that the helices in homeodomains are generally longer than the helices in most HLH regulatory proteins. Portions of the motif which interact most directly with DNA differ between these two families. Detailed examples of DNA-protein binding are described in Pabo and Sauer (1992; Ann Rev Biochem 61:1053-95).

[0011] A third motif incorporates zinc ions into the crucial portion of the protein. These proteins are most often referred to as having zinc fingers, although their structure can be one of several types. Proteins in this family often contain tandem repeats of the 30-residue zinc finger motif, including the sequence pattern Cys-X₂ or ₄-Cys-X₁₂-His-X₃₋₅-His. Each of these regulatory proteins has an α helix and an antiparallel β sheet. Two histidines in the α helix and two cysteines near the turn in the β sheet interact with the zinc ion which holds the α helix and the β sheet together. Contact with the DNA is made by the arginine preceding the α helix, and by the second, third, and sixth residues of the α helix. When this arrangement is repeated as a cluster of several fingers, the α helix of each finger can contact and interact with the major groove of the DNA. By changing the number of zinc fingers, the specificity and strength of the binding interaction can be altered.
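A sequence pattern of this kind can be searched for with a regular expression. The sketch below is illustrative only: it assumes the spacing Cys-X(2 or 4)-Cys-X(12)-His-X(3 to 5)-His, where the upper bound of five residues between the histidines is taken from the commonly cited C2H2 consensus rather than from this text. The function name and the test peptide are hypothetical and are not taken from the sequence listing.

```python
import re

# Assumed pattern: Cys, 2 or 4 residues, Cys, 12 residues, His, 3-5 residues, His.
ZINC_FINGER = re.compile(r"C(?:[A-Z]{2}|[A-Z]{4})C[A-Z]{12}H[A-Z]{3,5}H")

def find_zinc_fingers(protein):
    """Return (1-based start, matched text) for each non-overlapping match."""
    return [(m.start() + 1, m.group(0))
            for m in ZINC_FINGER.finditer(protein.upper())]

# Illustrative peptide with the spacing Cys-X4-Cys-X12-His-X3-His.
peptide = "YACPVESCDRRFSRSDELTRHIRIHTGQKP"
hits = find_zinc_fingers(peptide)
print(hits)
```

A peptide in which the two cysteines are separated by three residues would not be reported, since the assumed pattern allows only two or four.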

[0012] The steroid receptors are a family of intracellular proteins that include receptors for steroids, retinoids, vitamin D, thyroid hormones, and other important compounds. The DNA binding domain of these proteins contains about 70 residues, eight of which are conserved cysteines. The steroid receptor motif forms a structure in which two α helices are packed perpendicularly to each other, forming more of a globular shape than a finger. Each helix has a zinc ion which holds a peptide loop against the N-terminal end of the helix. The first helix fits into the major groove of DNA, and side chains make contacts with edges of the DNA base pairs. The steroid receptor proteins, like the helix-turn-helix proteins, form dimers that bind the DNA. The second helix of each monomer contacts the phosphate groups of the DNA backbone and also provides the dimerization interface. In some cases, multiple choices can exist for heterodimerization, which produces another mechanism for fine-tuning the regulation of numerous genes.

[0013] Another family of regulatory protein molecules uses a motif consisting of a two-stranded antiparallel β sheet to recognize the major groove of DNA. The exact DNA sequence recognized by the motif depends on the amino acid sequence in the β sheet from which the amino acid side chains extend and contact the DNA. In two prokaryotic examples of the β sheet, the regulatory proteins form tetramers when binding DNA.

[0014] The leucine zipper motif commonly forms dimers and has a 30-40 residue motif in which two α helices (one from each monomer) are joined to form a short coiled-coil. The helices are held together by interactions among hydrophobic amino acid side chains (often on heptad-repeated leucines) that extend from one side of each helix. Beyond this, the helices separate, and each basic region contacts the major groove of DNA. Proteins with the leucine zipper motif can also form either homodimers or heterodimers, thus extending the specific combinations available to activate or repress expression.

[0015] Yet another motif is the helix-loop-helix, which consists of a short α helix connected by a loop to a longer α helix. The loop is flexible and allows the two helices to fold back against each other. The α helices bind both to DNA and to the HLH structure of another protein. The second protein can be the same (producing homodimers) or different (producing heterodimers). Some HLH monomers lack sufficient α helix to bind DNA, but they can still form heterodimers which can serve to inactivate specific regulatory proteins.

[0016] Hundreds of regulatory proteins have been identified to date, and more are being characterized in a wide variety of organisms. Most regulatory proteins have at least one of the common structural motifs for making contact with DNA, but several regulatory proteins, such as the p53 tumor suppressor protein, do not share their structure with other known regulatory proteins. Variations on the known motifs and new motifs have been and are currently being characterized (Faisst and Meyer (1992) Nucl Acids Res 20:3-26).

[0017] Although binding of DNA to a regulatory protein is very specific, there is no way to predict the exact DNA sequence to which a particular regulatory protein will bind or the primary structure of a regulatory protein for a specific DNA sequence. Thus, interactions of DNA and regulatory proteins are not limited to the motifs described above. Other domains of the proteins often form crucial contacts with the DNA, and accessory proteins can provide interactions which may convert a particular protein complex to an activator or a repressor or may prevent binding (Alberts et al. (1994) Molecular Biology of the Cell, Garland Publishing, New York NY, pp. 401-74).

Diseases and disorders related to gene regulation

[0018] Many neoplastic growths in humans can be traced to problems of gene regulation. Malignant growth of cells may be the result of excess transcriptional activator or loss of an inhibitor or suppressor (Cleary (1992) Cancer Surv 15:89-104). Alternatively, gene fusion may produce chimeric loci with switched domains, such that the level of activation is no longer correct for the gene specificity of that factor.

[0019] The cellular response to infection or trauma is beneficial when genes are appropriately expressed. However, when hyper-responsivity or another imbalance occurs for any reason, dysregulation of gene expression may cause considerable tissue or organ damage. This damage is well documented in immunological responses to allergens, heart attack, stroke, and infections (Harrison's Principles of Internal Medicine, 13/e©, (1994) McGraw Hill and Teton Data Systems, Jackson Wyo.). In addition, the accumulation of somatic mutations and the increasing inability to regulate cellular responses are seen in the prevalence of osteoarthritis and onset of other aging disorders.

[0020] The discovery of new human regulatory protein molecules which are expressed during disease development and the polynucleotides which encode them satisfies a need in the art by providing compositions which are useful in the diagnosis and treatment of diseases associated with cell proliferation, particularly immune responses and cancers.

SUMMARY OF THE INVENTION

[0021] The invention features purified proteins, human regulatory molecules, collectively referred to as HRM and individually referred to as HRM-1 through HRM-49. In one embodiment, the purified protein comprises an amino acid sequence selected from SEQ ID NO:1 through SEQ ID NO:49 and portions thereof.

[0022] The invention provides isolated polynucleotides encoding HRM and complements of the encoding polynucleotides. In one embodiment, the polynucleotide comprises a nucleic acid sequence selected from SEQ ID NOs:50-98 and complements thereof.

[0023] The invention also provides a polynucleotide, or a complement or a fragment thereof, which is used as a probe to hybridize to any one of the polynucleotides of SEQ ID NOs:50-98. The invention further provides a composition comprising the isolated and purified polynucleotides of SEQ ID NOs:50-98. In addition, the invention provides a composition comprising a polynucleotide selected from SEQ ID NOs:50-98 and complements and fragments thereof and a reporter molecule or stabilizing moiety. The invention still further provides a method for detecting expression of a polynucleotide which encodes a human regulatory molecule in a sample, the method comprising hybridizing the complement of a polynucleotide encoding HRM to nucleic acids of the sample under conditions to form a hybridization complex; and detecting hybridization complex formation, wherein complex formation indicates the expression of the polynucleotide encoding the human regulatory molecule in the sample. In one aspect, the complement of the polynucleotide encoding HRM is immobilized on a substrate. In another aspect, the substrate is a microarray.

[0024] The invention provides a vector containing at least a fragment of any one of the polynucleotides selected from SEQ ID NOs:50-98. In one embodiment, the vector is contained within a host cell. The invention also provides a method for producing a protein or a portion thereof, the method comprising culturing a host cell containing a vector containing at least a fragment of a polynucleotide encoding an HRM under conditions for the expression of the protein; and recovering the protein from the host cell culture.

[0025] The invention further provides a composition comprising a purified HRM and a labeling moiety or a pharmaceutical carrier. The invention still further provides a method for using an HRM to screen a plurality of molecules in order to obtain a ligand which specifically binds the HRM, the method comprising combining the protein with the molecules under conditions which allow specific binding, recovering the bound protein, and separating the protein, thereby obtaining the ligand. In one aspect, the molecules are selected from libraries of agonists, antibodies, antagonists, drugs, inhibitors, peptides, proteins, and pharmaceutical agents.

[0026] The invention still further provides a method for using a protein to produce and purify an antibody, the method comprising immunizing an animal with an HRM under conditions to elicit an antibody response; isolating animal antibodies; attaching the protein to a substrate; contacting the substrate with sera containing antibodies under conditions to allow specific binding to the HRM; and dissociating the antibodies from the HRM, thereby obtaining purified antibodies.

[0027] The invention provides a purified antibody which specifically binds an HRM. The invention also provides a method for using an antibody to detect protein expression in a sample, the method comprising combining the antibody specifically binding HRM with a sample under conditions to form antibody:protein complexes and detecting complex formation, wherein detection indicates expression of the protein in the sample. In one aspect, expression of the HRM is diagnostic of cancer. In another aspect, expression is diagnostic of immune response.

[0028] The invention also provides a method for diagnosing a disease associated with gene expression in a sample containing nucleic acids, the method comprising hybridizing a polynucleotide to nucleic acids of the sample under conditions to form a hybridization complex, and comparing hybridization complex formation to standards, thereby diagnosing the disease. In one aspect, the disease is selected from a disorder characterized by cell proliferation such as a cancer, a developmental disorder, or an immune response.

[0029] The invention provides a method for treating a cancer comprising administering to a subject in need of such treatment a composition containing purified HRM. The invention also provides a method for treating a cancer comprising administering to a subject in need of such treatment an antagonist which specifically binds HRM. The invention further provides a method for treating an immune response associated with the increased expression or activity of HRM comprising administering to a subject in need of such treatment an antagonist which specifically binds HRM. The invention still further provides a method for stimulating cell proliferation comprising administering purified HRM to a cell.

DESCRIPTION OF THE INVENTION

[0030] Before the present proteins, polynucleotides, and methods are described, it is to be understood that this invention is not limited to any particular methodology, protocol, cell line, vector, or reagent, as these may vary. It must also be noted that in the appended claims, the singular forms “a”, “an”, and “the” include plural reference unless the context clearly dictates otherwise. For example, reference to “a host cell” includes a plurality of such host cells; reference to an “antibody” includes one or more antibodies and equivalents thereof known to those skilled in the art.

[0031] Unless defined herein, all technical and scientific terms have the same meanings commonly understood by one of ordinary skill in the art to which this invention belongs. The terminology is used for the purpose of describing particular embodiments and is not intended to limit the scope of the present invention which will be limited only by the appended claims. Although any methods and materials similar or equivalent to those described can be used in the practice or testing of the invention, the preferred methods, devices, and materials are now described. All publications are incorporated herein by reference for the purpose of describing and disclosing the cell lines, vectors, arrays and methodologies which are reported in the publications and which might be used in connection with the invention. Nothing herein is to be construed as an admission that the invention is not entitled to antedate such disclosure by virtue of prior invention.

[0032] DEFINITIONS

[0033] “Agonist” refers to a molecule which specifically binds to and modulates the activity of HRM.

[0034] An “allele” is an alternative form of the polynucleotide or gene encoding HRM. Alleles result from at least one mutation in the nucleic acid sequence and may result in the expression of altered mRNAs or proteins whose structure or function may or may not be altered. Any given gene may have none, one, or many allelic forms. Common mutational changes which give rise to alleles are generally ascribed to additions, deletions, or substitutions of nucleotides. Each of these types of changes may occur alone, or in combination with the others, one or more times in a given sequence. Similarly, a polynucleotide may be altered to produce deliberate amino acid substitutions. These substitutions may be made on the basis of similarity in polarity, charge, solubility, hydrophobicity, hydrophilicity, and/or the amphipathic nature of the residues as long as the biological or immunological activity of HRM is retained. For example, negatively charged residues include aspartic acid and glutamic acid; positively charged residues include lysine and arginine; and residues with uncharged polar head groups having similar hydrophilicity values include leucine, isoleucine, and valine; glycine and alanine; asparagine and glutamine; serine and threonine; and phenylalanine and tyrosine.

[0035] “Antagonist” refers to a molecule which, when bound to HRM, decreases the amount or the duration of the biological or immunological activity of HRM. Antagonists may include proteins, nucleic acids, carbohydrates, fats or any other molecules which decrease the effect of HRM.

[0036] “Antibody” refers to intact molecules, or fragments thereof such as Fab, F(ab′)₂, and Fv, which are capable of binding the antigenic determinant of an HRM.

[0037] “Biologically active” refers to a protein having structural, regulatory, or biochemical functions of a naturally occurring molecule. Likewise, “immunologically active” refers to the capability of the natural, recombinant, or synthetic protein or peptide to induce a specific immune response in animals or cells and to bind with specific antibodies.

[0038] “Complementary” refers to the natural binding of polynucleotides under permissive salt and temperature conditions by base-pairing. For example, the sequence “A-G-T” binds to the complementary sequence “T-C-A”. The degree of complementarity between nucleic acids has significant effects on the efficiency and strength of hybridization. This is important in amplification reactions and in the design and use of peptide nucleic acid molecules.
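The base-pairing rule in this definition can be sketched in a few lines. The example assumes DNA sequences written with the letters A, C, G, and T; the function names are hypothetical. It reproduces the pairing of “A-G-T” with “T-C-A” position by position, and also gives the partner strand read in its own 5′ to 3′ direction, which is how a complement is conventionally written for antiparallel strands.

```python
# Watson-Crick pairing for DNA (assumed alphabet: A, C, G, T).
PAIRS = {"A": "T", "T": "A", "G": "C", "C": "G"}

def complement(seq):
    """Base paired with each position, in the same left-to-right order."""
    return "".join(PAIRS[base] for base in seq.upper())

def reverse_complement(seq):
    """Partner strand written in its own 5' to 3' direction."""
    return complement(seq)[::-1]

paired = complement("AGT")
partner_5_to_3 = reverse_complement("AGT")
print(paired, partner_5_to_3)
```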

[0039] A “composition” refers to a combination comprising a plurality of polynucleotides or a specific polynucleotide or protein and at least one other molecule. Such other molecules may include reporter molecules, labeling moieties, pharmaceutical carriers, carbohydrates, and the like.

[0040] “Consensus” refers to a nucleic acid sequence which has been resequenced to resolve uncalled bases, has been extended using an XL-PCR kit (Applied Biosystems (ABI), Foster City Calif.) in the 5′ and/or the 3′ direction and resequenced, or has been assembled to full length from overlapping shorter fragments using a computer program for fragment assembly such as that described in U.S. Ser. No. 09/276,534, filed 25 Mar. 1999.

[0041] “Derivative” refers to the chemical modification of a polynucleotide or protein. Such modifications may include replacement of hydrogen by an alkyl, acyl, or amino group. A nucleic acid derivative may encode a protein which retains the biological or immunological function of the natural molecule. A derivative protein is one which is modified by glycosylation, pegylation, or any similar process but still retains the biological or immunological function of the native protein.

[0042] “Differential expression” refers to an increased, upregulated or present, or decreased, downregulated or absent, gene expression as detected by presence, absence or at least about two-fold changes in the amount of transcribed messenger RNA or translated protein in a sample.
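One way to apply this definition to a pair of measurements is sketched below. It is a hedged illustration, not a method described in this disclosure: the function name, the use of a reference measurement for comparison, and the exact cutoff of 2.0 for “at least about two-fold” are assumptions.

```python
def classify_expression(sample, reference, fold=2.0):
    """Label a sample measurement relative to a reference measurement."""
    if sample < 0 or reference < 0:
        raise ValueError("expression amounts must be non-negative")
    if sample == 0 and reference == 0:
        return "unchanged"            # not detected in either measurement
    if reference == 0:
        return "present"              # detected only in the sample
    if sample == 0:
        return "absent"               # detected only in the reference
    ratio = sample / reference
    if ratio >= fold:
        return "increased"
    if ratio <= 1.0 / fold:
        return "decreased"
    return "unchanged"

for pair in [(10, 4), (2, 4), (3, 4), (5, 0), (0, 5)]:
    print(pair, classify_expression(*pair))
```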

[0043] “Disorder” refers to a condition, disease or syndrome in which a polynucleotide or a protein of the invention is differentially expressed. Such a disorder includes cancers or immune responses as they are set forth below.

[0044] “HRM” refers to any one or all of the human proteins, HRMs 1-49, as obtained from any species including bovine, ovine, porcine, murine, equine, and preferably human, or from any source whether natural, synthetic, semi-synthetic, or recombinant.

[0045] “Hybridization complex” refers to a complex formed between two nucleic acids by the formation of hydrogen bonds between complementary base pairs; these hydrogen bonds form in an antiparallel configuration and may be further stabilized by base stacking interactions. A hybridization complex may be formed in solution or between one nucleic acid present in solution and another nucleic acid immobilized on a substrate.

[0046] “Isolated” refers to a polynucleotide that is removed from its natural environment or separated from other components with which it is naturally associated.

[0047] “Ligand” refers to any agent, molecule, or compound which will bind specifically to a polynucleotide or to a protein. Such ligands stabilize or modulate the activity of polynucleotides or proteins and may be composed of inorganic and/or organic substances including minerals, cofactors, nucleic acids, proteins, carbohydrates, fats, and lipids.

[0048] “Microarray” refers to an arrangement of distinct polynucleotides on a substrate. “Oligonucleotide” refers to a nucleic acid sequence about 6 nucleotides to about 60 nucleotides in length which may be used in amplification or hybridization assays. Equivalent terms include “amplimers”, “primers”, “oligomers”, and “probes”, as these are commonly defined in the art. “Peptide nucleic acid” refers to an anti-gene agent which comprises an oligonucleotide of at least five nucleotides in length linked to a peptide backbone of amino acid residues which ends in a terminal lysine which confers solubility to the molecule.

[0049] “Polynucleotide” refers to a nucleic acid molecule having a nucleic acid sequence and to DNA or RNA of genomic or synthetic origin which may be single- or double-stranded and represent the sense or antisense strand. “Fragment” refers to a nucleic acid sequence which is more than about 60 nucleotides in length.

[0050] “Portion” refers to a fragment of an HRM which ranges in size from five amino acid residues to the entire amino acid sequence minus one amino acid.

[0051] “Protein” refers to an oligopeptide, peptide, or polypeptide having an amino acid sequence, whether a naturally occurring or a synthetic molecule. Portions of HRM are preferably about 5 to about 15 amino acids in length and retain the biological or the immunological activity of the HRM.

[0052] “Purified” refers to a peptide or protein that is removed from its natural environment, isolated or separated from other components with which it is naturally associated.

[0053] “Reporter molecules” or “labeling moieties” include radionuclides, enzymes, fluorescent, chemiluminescent, or chromogenic agents as well as substrates, cofactors, inhibitors, magnetic particles, and the like.

[0054] “Sample” is used in its broadest sense and may comprise a bodily fluid, extract from a cell, chromosome, organelle, or membrane isolated from a cell, a cell, genomic DNA, RNA, or cDNA (in solution or bound to a solid support), a tissue, a tissue print, and the like.

[0055] “Specific binding” refers to that interaction between a polynucleotide or protein of the invention and any ligand which specifically binds to it and which is selected from a DNA or an RNA molecule, a peptide nucleic acid, a peptide, a protein, an agonist, an antibody, an antagonist, an inhibitor, a mimetic, a pharmaceutical agent, a drug, a transcription factor, or an artificial chromosome construction. The interaction is dependent upon the presence of a particular sequence or three dimensional structure recognized by the binding molecule.

[0056] “Substrate” refers to any rigid or semi-rigid support to which polynucleotides or proteins are bound and includes membranes, filters, chips, slides, wafers, fibers, magnetic or nonmagnetic beads, gels, capillaries or other tubing, plates, polymers, and microparticles with a variety of surface forms including wells, trenches, pins, channels and pores.

[0057] “Variant” refers to molecules that are recognized variations of a polynucleotide or a protein encoded by the polynucleotide. Splice variants may be determined by BLAST score, wherein the score is at least 100, and most preferably at least 400. Allelic variants have a high percent identity to the polynucleotides and may differ by about three bases per hundred bases. “Single nucleotide polymorphism” (SNP) refers to a change in a single base as a result of a substitution, insertion or deletion. The change may be conservative (purine for purine) or non-conservative (purine to pyrimidine) and may or may not result in a change in an encoded amino acid or its secondary, tertiary, or quaternary structure.
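The distinction between conservative and non-conservative single-base substitutions can be expressed as a short classification. This is an illustrative sketch: the function name is hypothetical, and treating a pyrimidine-for-pyrimidine change as conservative is an assumption by analogy with the purine-for-purine example given in the definition.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}

def classify_substitution(ref, alt):
    """Classify a single-base substitution by the chemical class of each base."""
    ref, alt = ref.upper(), alt.upper()
    bases = PURINES | PYRIMIDINES
    if ref not in bases or alt not in bases:
        raise ValueError("bases must be one of A, C, G, T")
    if ref == alt:
        raise ValueError("reference and alternate base are identical")
    same_class = (ref in PURINES) == (alt in PURINES)
    return "conservative" if same_class else "non-conservative"

for ref, alt in [("A", "G"), ("A", "C"), ("C", "T")]:
    print(ref, alt, classify_substitution(ref, alt))
```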

[0058] THE INVENTION

[0059] The invention is based on the discovery of human regulatory molecules (HRM) and the polynucleotides encoding HRM, and on the use of these compositions for the diagnosis and treatment of diseases associated with cell proliferation. Table 1 shows the protein and polynucleotide identification numbers, protein abbreviation, Incyte Clone number, cDNA library, and the closest NCBI homolog and NCBI sequence identifier for each of the human regulatory molecules.

TABLE 1
Protein       Nucleotide    Abbreviation  Clone ID  Library    NCBI Homolog
SEQ ID NO:1   SEQ ID NO:50  HRM-1         133       U937NOT01  g285947 KIAA0105
SEQ ID NO:2   SEQ ID NO:51  HRM-2         1762      U937NOT01  g1518121 Ascaris suum
SEQ ID NO:3   SEQ ID NO:52  HRM-3         1847      U937NOT01  g1302211 Saccharomyces cerevisiae
SEQ ID NO:4   SEQ ID NO:53  HRM-4         9337      HMC1NOT01  g1613852 Human zinc finger protein (zf2)
SEQ ID NO:5   SEQ ID NO:54  HRM-5         9476      HMC1NOT01  g755784 S. cerevisiae
SEQ ID NO:6   SEQ ID NO:55  HRM-6         10370     THP1PLB01  g895845 Human putative p64 CLCP protein
SEQ ID NO:7   SEQ ID NO:56  HRM-7         30137     THP1NOB01  g1710241 Human clone 23733 mRNA
SEQ ID NO:8   SEQ ID NO:57  HRM-8         77180     SYNORAB01  g5372 S. cerevisiae
SEQ ID NO:9   SEQ ID NO:58  HRM-9         98974     PITUNOR01  g1627704 Caenorhabditis elegans
SEQ ID NO:10  SEQ ID NO:59  HRM-10        118160    MUSCNOT01  g220594 Mus musculus
SEQ ID NO:11  SEQ ID NO:60  HRM-11        140516    TLYMNOR01  g1086723 C. elegans
SEQ ID NO:12  SEQ ID NO:61  HRM-12        207452    SPLNNOT02  g1314086 S. cerevisiae
SEQ ID NO:13  SEQ ID NO:62  HRM-13        208836    SPLNNOT02  g662126 S. cerevisiae
SEQ ID NO:14  SEQ ID NO:63  HRM-14        569710    MMLR3DT01  g1698719 Human zinc finger protein
SEQ ID NO:15  SEQ ID NO:64  HRM-15        606742    BRSTTUT01  g1710201 Human clone 23679 mRNA
SEQ ID NO:16  SEQ ID NO:65  HRM-16        611135    COLNNOT01  g506882 C. elegans
SEQ ID NO:17  SEQ ID NO:66  HRM-17        641127    BRSTNOT03  g1310668 Human Hok-2 gene product
SEQ ID NO:18  SEQ ID NO:67  HRM-18        691768    LUNGTUT02  g309183 Mus musculus
SEQ ID NO:19  SEQ ID NO:68  HRM-19        724157    SYNOOAT01  g577542 C. elegans C16C10
SEQ ID NO:20  SEQ ID NO:69  HRM-20        864683    BRAITUT03  g1418563 C. elegans
SEQ ID NO:21  SEQ ID NO:70  HRM-21        933353    CERVNOT01  g1657672 C. elegans
SEQ ID NO:22  SEQ ID NO:71  HRM-22        1404643   LATRTUT02  g459002 C. elegans
SEQ ID NO:23  SEQ ID NO:72  HRM-23        1561587   SPLNNOT04  g868266 C. elegans
SEQ ID NO:24  SEQ ID NO:73  HRM-24        1568361   UTRSNOT05  g1834503 Human mucin
SEQ ID NO:25  SEQ ID NO:74  HRM-25        1572888   LNODNOT03  g603396 S. cerevisiae YER156c
SEQ ID NO:26  SEQ ID NO:75  HRM-26        1573677   LNODNOT03  g849195 S. cerevisiae D9481.16
SEQ ID NO:27  SEQ ID NO:76  HRM-27        1574624   LNODNOT03  g1067025 C. elegans R07E5.14
SEQ ID NO:28  SEQ ID NO:77  HRM-28        1577239   LNODNOT03  g728657 S. cerevisiae
SEQ ID NO:29  SEQ ID NO:78  HRM-29        1598203   BLADNOT03  g1200033 C. elegans F35G2
SEQ ID NO:30  SEQ ID NO:79  HRM-30        1600438   BLADNOT03  g286001 KIAA0005
SEQ ID NO:31  SEQ ID NO:80  HRM-31        1600518   BLADNOT03  g790405 C. elegans
SEQ ID NO:32  SEQ ID NO:81  HRM-32        1602473   BLADNOT03  g1574570 Haemophilus influenzae
SEQ ID NO:33  SEQ ID NO:82  HRM-33        1605720   LUNGNOT15  g1055080 C. elegans
SEQ ID NO:34  SEQ ID NO:83  HRM-34        1610501   COLNTUT06  g313741 S. cerevisiae YBL0514
SEQ ID NO:35  SEQ ID NO:84  HRM-35        1720770   BLADNOT06  g1006641 C. elegans F46C5
SEQ ID NO:36  SEQ ID NO:85  HRM-36        1832295   BRAINON01  g561637 Human enigma protein
SEQ ID NO:37  SEQ ID NO:86  HRM-37        1990522   CORPNOT02  g558396 S. cerevisiae
SEQ ID NO:38  SEQ ID NO:87  HRM-38        2098087   BRAITUT02  g1066284 Mus musculus uterine mRNA
SEQ ID NO:39  SEQ ID NO:88  HRM-39        2112230   BRAITUT03  g861306 C. elegans
SEQ ID NO:40  SEQ ID NO:89  HRM-40        2117050   BRSTTUT02  g687821 C. elegans
SEQ ID NO:41  SEQ ID NO:90  HRM-41        2184712   SININOT01  g868241 C. elegans C56C10
SEQ ID NO:42  SEQ ID NO:91  HRM-42        2290475   BRAINON01  g733605 C. elegans
SEQ ID NO:43  SEQ ID NO:92  HRM-43        2353452   LUNGNOT20  g1507666 Schizosaccharomyces pombe
SEQ ID NO:44  SEQ ID NO:93  HRM-44        2469611   THP1NOT03  g1495332 C. elegans
SEQ ID NO:45  SEQ ID NO:94  HRM-45        2515476   LIVRTUT04  g1665790 KIAA0262
SEQ ID NO:46  SEQ ID NO:95  HRM-46        2754573   THP1AZS08  g478990 Human RNA binding protein
SEQ ID NO:47  SEQ ID NO:96  HRM-47        2926777   TLYMNOT04  g687823 C. elegans
SEQ ID NO:48  SEQ ID NO:97  HRM-48        3217567   TESTNOT07  g1841547 Human HLA class III region
SEQ ID NO:49  SEQ ID NO:98  HRM-49        3339274   SPLNNOT10  g1177434 Human mRNA
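By way of illustration only, the pairing of protein and nucleotide sequence identifiers in Table 1 can be sketched in Python. In every row of Table 1 the nucleotide SEQ ID NO equals the protein SEQ ID NO plus 49. The names TABLE_1_ROWS and nucleotide_seq_id are hypothetical and are introduced here only for the sketch; the rows shown are a small sample copied from Table 1.

```python
# Illustrative sample of Table 1 rows:
# (protein SEQ ID NO, nucleotide SEQ ID NO, abbreviation, Incyte clone ID, cDNA library)
TABLE_1_ROWS = [
    (1, 50, "HRM-1", 133, "U937NOT01"),
    (2, 51, "HRM-2", 1762, "U937NOT01"),
    (24, 73, "HRM-24", 1568361, "UTRSNOT05"),
    (49, 98, "HRM-49", 3339274, "SPLNNOT10"),
]


def nucleotide_seq_id(protein_seq_id: int) -> int:
    """Return the nucleotide SEQ ID NO paired with a protein SEQ ID NO in Table 1."""
    if not 1 <= protein_seq_id <= 49:
        raise ValueError("Table 1 lists protein SEQ ID NOs 1 through 49")
    # Offset observed in every row of Table 1 (hypothetical helper, not part of the disclosure).
    return protein_seq_id + 49


# Confirm the offset rule against the sampled rows.
for protein_id, nucleotide_id, *_ in TABLE_1_ROWS:
    assert nucleotide_seq_id(protein_id) == nucleotide_id
```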

[0060] HRM-1 (SEQ ID NO:1) was identified in Incyte Clone 133 from the U937NOT01 cDNA library using a computer search for amino acid sequence alignments. A consensus sequence, SEQ ID NO:50, was derived from the extended and overlapping nucleic acid sequences: Incyte Clones 133 (U937NOT01), 013508 (THP1PLB01), 210174 (SPLNNOT02), 1655863 (PROSTUT08), 1725724 (PROSNOT14), 1858205 (PROSNOT18), and 2646014 (OVARTUT05).
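The idea of deriving a consensus sequence from overlapping clone sequences can be illustrated with a toy Python sketch. This is not the assembly procedure used for the Incyte clones, which this section does not specify; it is a minimal suffix-prefix merge under the assumption of error-free reads given in order. The function merge_overlap and the three short reads are hypothetical and are not taken from any HRM sequence.

```python
def merge_overlap(left: str, right: str, min_overlap: int = 3) -> str:
    """Join two reads where the longest suffix of `left` equals a prefix of `right`."""
    longest = min(len(left), len(right))
    for size in range(longest, min_overlap - 1, -1):
        if left.endswith(right[:size]):
            return left + right[size:]
    raise ValueError("no overlap of the required length")


# Toy reads (invented for illustration), listed in 5' to 3' order.
reads = ["ATGGCGTAC", "GTACCTGAA", "TGAATTTAG"]

consensus = reads[0]
for read in reads[1:]:
    consensus = merge_overlap(consensus, read)

print(consensus)  # ATGGCGTACCTGAATTTAG
```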

[0061] In one embodiment, the invention encompasses a protein comprising the amino acid sequence of SEQ ID NO:1. HRM-1 is 151 amino acids in length and has four potential phosphorylation sites at T2, S14, S69, and T111. HRM-1 has sequence homology with human KIAA0105 (g285947) and is found in cDNA libraries which have proliferating cells and are associated with cancer or immune response.
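Site labels such as T2 or S14 combine a one-letter amino acid code with a 1-based residue position, so each position should not exceed the stated length of the protein. The following Python sketch, offered only as an illustration, parses such labels and reports any that fall outside the protein. The helper names parse_site and sites_out_of_range are hypothetical.

```python
import re

# One uppercase residue letter followed by a 1-based position, e.g. "T111".
SITE_LABEL = re.compile(r"^([A-Z])(\d+)$")


def parse_site(label: str) -> tuple[str, int]:
    """Split a site label into (residue letter, position)."""
    match = SITE_LABEL.match(label)
    if match is None:
        raise ValueError(f"unrecognized site label: {label!r}")
    return match.group(1), int(match.group(2))


def sites_out_of_range(labels, protein_length):
    """Return the labels whose position lies outside 1..protein_length."""
    return [
        label
        for label in labels
        if not 1 <= parse_site(label)[1] <= protein_length
    ]


# Values stated for HRM-1: 151 amino acids, sites at T2, S14, S69, and T111.
hrm1_sites = ["T2", "S14", "S69", "T111"]
print(sites_out_of_range(hrm1_sites, 151))  # []
```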

[0062] HRM-2 (SEQ ID NO:2) was identified in Incyte Clone 1762 from the U937NOT01 cDNA library using a computer search for amino acid sequence alignments. A consensus sequence, SEQ ID NO:51, was derived from the extended and overlapping nucleic acid sequences: Incyte Clones 1762 (U937NOT01), 1254927 (LUNGFET03), and 2070865 (ISLTNOT01).

[0063] In one embodiment, the invention encompasses a protein comprising the amino acid sequence of SEQ ID NO:2. HRM-2 is 185 amino acids in length and has a potential N glycosylation site at N108; eight potential phosphorylation sites at T22, S26, T27, S31, T51, T70, and T135; a leucine zipper motif at L₁₃₆KDVVWGLNSLFTDLLNFDDPL; and a ubiquitin conjugation motif at W₁₀₅HPNITETGEICLSL. HRM-2 has sequence homology with a gene from Ascaris suum (g1518121) and is found in cDNA libraries which have secretory or proliferating cells and are associated with development.
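As a hedged illustration of how such motifs can be located, the Python sketch below scans the two HRM-2 fragments quoted above with regular expressions. The patterns are assumptions based on the commonly cited PROSITE-style definitions (N-{P}-[ST]-{P} for N glycosylation and L-x(6)-L-x(6)-L-x(6)-L for a leucine zipper); the section does not state which motif definitions or software were actually used. The helper motif_positions is hypothetical.

```python
import re

# Assumed PROSITE-style patterns; the lookahead allows overlapping N glycosylation hits.
N_GLYCOSYLATION = re.compile(r"(?=N[^P][ST][^P])")
LEUCINE_ZIPPER = re.compile(r"L.{6}L.{6}L.{6}L")


def motif_positions(pattern, fragment, first_residue):
    """Return 1-based protein positions of each match in a fragment
    whose first residue sits at position `first_residue`."""
    return [first_residue + m.start() for m in pattern.finditer(fragment)]


# Fragments quoted in the description of HRM-2, with their stated start positions.
print(motif_positions(N_GLYCOSYLATION, "WHPNITETGEICLSL", 105))        # [108]
print(motif_positions(LEUCINE_ZIPPER, "LKDVVWGLNSLFTDLLNFDDPL", 136))  # [136]
```

Under these assumed patterns, the N glycosylation hit inside the fragment beginning at W105 falls at N108, which agrees with the site stated for HRM-2.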

[0064] HRM-3 (SEQ ID NO:3) was identified in Incyte Clone 1847 from the U937NOT01 cDNA library using a computer search for amino acid sequence alignments. A consensus sequence, SEQ ID NO:52, was derived from the extended and overlapping nucleic acid sequences: Incyte Clones 274 (U937NOT01), 1847 (U937NOT01), 262233 (HNT2AGT01), 972977 (MUSCNOT02), and 1859611 (PROSNOT18).

[0065] In one embodiment, the invention encompasses a protein comprising the amino acid sequence of SEQ ID NO:3. HRM-3 is 59 amino acids in length and has four potential N glycosylation sites at N147, N352, N410, and N421, and 17 potential phosphorylation sites at S13, T21, S43, S89, S131, S207, T243, S278, T286, S335, S337, S350, S354, S369, S380, S412, and S542. HRM-3 has sequence homology with a Saccharomyces cerevisiae protein (g1302211) and is found in cDNA libraries which have proliferating or immortalized cells.

[0066] HRM-4 (SEQ ID NO:4) was identified in Incyte Clone 9337 from the HMC1NOT01 cDNA library using a computer search for amino acid sequence alignments. A consensus sequence, SEQ ID NO:53, was derived from the extended and overlapping nucleic acid sequences: Incyte Clones 9337 (HMC1NOT01), 670279 (CRBLNOT01), 717305 (PROSTUT01), 968249 (BRSTNOT05), and 1546506 (PROSTUT04).

[0067] In one embodiment, the invention encompasses a protein comprising the amino acid sequence of SEQ ID NO:4. HRM-4 is 338 amino acids in length and has a potential N glycosylation site at N327; 11 potential phosphorylation sites at T15, S36, S42, S50, T51, S73, S144, S176, T256, S140, and T329; and five zinc finger motifs at C₁₉₂RC₁₉₄SECGKIFRNPRYFSVHKKIH, C₂₂₂QDCGKGFVQSSSLTQHQRVH, C₂₅₀QECGRTFNDRSAISQHLRTH, C₂₇₈QDCGKAFRQSSHLIRHQRTH, and C₃₀₆NKCGKAFTQSSHLIGHQRTH. HRM-4 has sequence homology with a human zinc finger protein (g1613852) and is found in cDNA libraries which have proliferating, cancerous, or secretory cells.
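For illustration only, the zinc finger motifs listed for HRM-4 can be checked against a C2H2-type pattern in Python. The pattern used below, C-x(2,4)-C-x(3)-[LIVMFYWC]-x(8)-H-x(3,5)-H, is an assumption based on the commonly cited PROSITE-style definition and is not stated in this section. The names C2H2_ZINC_FINGER, hrm4_motifs, and is_c2h2 are hypothetical.

```python
import re

# Assumed C2H2-type pattern: C-x(2,4)-C-x(3)-[LIVMFYWC]-x(8)-H-x(3,5)-H
C2H2_ZINC_FINGER = re.compile(r"C.{2,4}C.{3}[LIVMFYWC].{8}H.{3,5}H")

# Motifs copied from the description of HRM-4, keyed by the position of the first cysteine.
hrm4_motifs = {
    222: "CQDCGKGFVQSSSLTQHQRVH",
    278: "CQDCGKAFRQSSHLIRHQRTH",
    306: "CNKCGKAFTQSSHLIGHQRTH",
}


def is_c2h2(motif: str) -> bool:
    """True if the whole string matches the assumed C2H2 pattern."""
    return C2H2_ZINC_FINGER.fullmatch(motif) is not None


for start, motif in hrm4_motifs.items():
    print(start, is_c2h2(motif))
```

Each of the three motifs shown is 21 residues long and matches the assumed pattern in full.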

[0068] HRM-5 (SEQ ID NO:5) was identified in Incyte Clone 9476 from the HMC1NOT01 cDNA library using a computer search for amino acid sequence alignments. A consensus sequence, SEQ ID NO:54, was derived from the extended and overlapping nucleic acid sequences: Incyte Clones 9476 (HMC1NOT01), 010403 (THP1PLB01), 495099 (HNT2NOT01), 1670783 (BMARNOT03), 1997203 (BRSTTUT03), and 2190637 (THYRTUT03).

[0069] In one embodiment, the invention encompasses a protein comprising the amino acid sequence of SEQ ID NO:5. HRM-5 is 456 amino acids in length and has a potential N glycosylation site at N385; 14 potential phosphorylation sites at T9, T12, S58, T74, T163, T139, S175, T211, T239, T272, S331, T367, T420, and S443; and an ATP/GTP binding motif at G₇₀PPGTGKT₇₇. HRM-5 has sequence homology with a S. cerevisiae protein (g755784) and is found in cDNA libraries which have dividing, cancerous or immortalized cells and are associated with immune response.
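Motif definitions of this kind are often written in PROSITE-style notation. As a sketch under stated assumptions, the Python function below converts a simple PROSITE-style pattern into a regular expression and applies it to the ATP/GTP binding motif quoted for HRM-5. The pattern [AG]-x(4)-G-K-[ST] is the commonly cited P-loop definition and is assumed here; the function prosite_to_regex is hypothetical and handles only the basic syntax (x, bracketed sets, braced exclusions, and repeat counts).

```python
import re


def prosite_to_regex(pattern: str) -> str:
    """Translate a simple PROSITE-style pattern into Python regex syntax."""
    parts = []
    for element in pattern.split("-"):
        # Split an element such as "x(2,4)" into its core and optional repeat bounds.
        match = re.fullmatch(r"(.+?)(?:\((\d+)(?:,(\d+))?\))?", element)
        core, low, high = match.group(1), match.group(2), match.group(3)
        if core == "x":
            core = "."                        # any residue
        elif core.startswith("{") and core.endswith("}"):
            core = "[^" + core[1:-1] + "]"    # any residue except those listed
        if low is not None:
            core += "{" + low + ("," + high if high else "") + "}"
        parts.append(core)
    return "".join(parts)


p_loop = prosite_to_regex("[AG]-x(4)-G-K-[ST]")
print(p_loop)                                          # [AG].{4}GK[ST]
print(re.fullmatch(p_loop, "GPPGTGKT") is not None)    # True
```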

[0070] HRM-6 (SEQ ID NO:6) was identified in Incyte Clone 10370 from the THP1PLB01 cDNA library using a computer search for amino acid sequence alignments. A consensus sequence, SEQ ID NO:55, was derived from the extended and overlapping nucleic acid sequences: Incyte Clones 010370 (THP1PLB01), 109018 (AMLBNOT01), 259388 (HNT2RAT01), and 1518624 (BLADTUT04).

[0071] In one embodiment, the invention encompasses a protein comprising the amino acid sequence of SEQ ID NO:6. HRM-6 is 210 amino acids in length and has one potential N glycosylation site at N11 and nine potential phosphorylation sites at T13, T21, T46, T124, S125, S132, T143, T167, and T191. HRM-6 has sequence homology with a putative p64 CLCP human protein (g895845) and is found in cDNA libraries which have dividing, cancerous or immortalized cells and are associated with immune response.

[0072] HRM-7 (SEQ ID NO:7) was identified in Incyte Clone 30137 from the THP1NOB01 cDNA library using a computer search for amino acid sequence alignments. A consensus sequence, SEQ ID NO:56, was derived from the extended and overlapping nucleic acid sequences: Incyte Clones 30137 (THP1NOB01), 531638 (BRAINOT03), 1653122 (PROSTUT08), and 1682227 (PROSNOT15).

[0073] In one embodiment, the invention encompasses a protein comprising the amino acid sequence of SEQ ID NO:7. HRM-7 is 255 amino acids in length and has one potential N glycosylation site at N86 and 12 potential phosphorylation sites at T9, T28, S32, S61, S94, S142, S156, S160, T169, S118, S220, and S236. HRM-7 has sequence homology with human clone 23733 (g1710241) and is found in cDNA libraries which have dividing, cancerous or immortalized cells and are associated with immune response.

[0074] HRM-8 (SEQ ID NO:8) was identified in Incyte Clone 77180 from the SYNORAB01 cDNA library using a computer search for amino acid sequence alignments. A consensus sequence, SEQ ID NO:57, was derived from the extended and overlapping nucleic acid sequences: Incyte Clones 077180 (SYNORAB01), 604706 (BRSTTUT01), 977901 (BRSTNOT02), 1870373 (SKINBIT01), and 2169441 (ENDCNOT03).

[0075] In one embodiment, the invention encompasses a protein comprising the amino acid sequence of SEQ ID NO:8. HRM-8 is 188 amino acids in length and has one potential amidation site, Q₁₇₀GKR; two potential N glycosylation sites at N60 and N68; and four potential phosphorylation sites at S70, T164, T166, and S183. HRM-8 has sequence homology with a S. cerevisiae protein (g5372) and is found in cDNA libraries which have dividing, cancerous or immortalized cells and are associated with immune response.

[0076] HRM-9 (SEQ ID NO:9) was identified in Incyte Clone 98974 from the PITUNOR01 cDNA library using a computer search for amino acid sequence alignments. A consensus sequence, SEQ ID NO:58, was derived from the extended and overlapping nucleic acid sequences: Incyte Clones 98974 (PITUNOR01), 443924 (MPHGNOT03), 1401540 (BRAITUT08), 1507305 (BRAITUT07), 1700814 (BLADTUT05), and 1809947 (PROSTUT12).

[0077] In one embodiment, the invention encompasses a protein comprising the amino acid sequence of SEQ ID NO:9. HRM-9 is 531 amino acids in length and has one potential N glycosylation site at N480; 37 potential phosphorylation sites at S19, T22, S38, T64, T76, T91, S117, S118, S158, T164, T177, T182, T200, T267, Y281, Y311, Y322, S333, S394, S402, S404, S409, S414, S416, S418, S429, S434, S439, S440, S456, S460, S466, S478, S505, S510, S524, and S528; and one potential glycosaminoglycan motif at S₄₃₄GSG. HRM-9 has sequence homology with a Caenorhabditis elegans protein (g1627704) and is found in cDNA libraries which have secretory, proliferating or immune cells.

[0078] HRM-10 (SEQ ID NO:10) was identified in Incyte Clone 118160 from the MUSCNOT01 cDNA library using a computer search for amino acid sequence alignments. A consensus sequence, SEQ ID NO:59, was derived from the extended and overlapping nucleic acid sequences: Incyte Clones 118160 (MUSCNOT01), 323015 (EOSIHET02), and 1856519 (PROSNOT18).

[0079] In one embodiment, the invention encompasses a protein comprising the amino acid sequence of SEQ ID NO:10. HRM-10 is 348 amino acids in length and has two potential N glycosylation sites at N150 and N317; 17 potential phosphorylation sites at T23, T45, S60, T126, S130, S140, S145, S151, S154, S158, S186, Y208, Y234, S217, T271, T303, and S327; and a transcription factor signature at C₃₁₀SKCKKKNCTYNQVQTRSADEPMTTFVLCNEC. HRM-10 has sequence homology with a Mus musculus protein (g220594) and is found in cDNA libraries which have secretory or immune associations.

[0080] HRM-11 (SEQ ID NO:11) was identified in Incyte Clone 140516 from the TLYMNOR01 cDNA library using a computer search for amino acid sequence alignments. A consensus sequence, SEQ ID NO:60, was derived from the extended and overlapping nucleic acid sequences: Incyte Clones 140516 (TLYMNOR01), 143729 (TLYMNOR01), 1346014 (PROSNOT11), and 2074866 (ISLTNOT01).

[0081] In one embodiment, the invention encompasses a protein comprising the amino acid sequence of SEQ ID NO:11. HRM-11 is 393 amino acids in length and has 14 potential phosphorylation sites at S22, T33, S41, S69, T156, Y157, S166, S199, T242, T308, T324, S350, T359, and S378. HRM-11 has sequence homology with a C. elegans protein (g1086723) and is found in cDNA libraries which have proliferating, secretory or immune cells.

[0082] HRM-12 (SEQ ID NO:12) was identified in Incyte Clone 207452 from the SPLNNOT02 cDNA library using a computer search for amino acid sequence alignments. A consensus sequence, SEQ ID NO:61, was derived from the extended and overlapping nucleic acid sequences: Incyte Clones 207452 (SPLNNOT02), 238306 (SINTNOT02), 1559492 (SPLNNOT04), and 1852567 (LUNGFET03).

[0083] In one embodiment, the invention encompasses a protein comprising the amino acid sequence of SEQ ID NO:12. HRM-12 is 320 amino acids in length and has one potential amidation site at E₂₁₀GKK; two potential N glycosylation sites at N12 and N314; seven potential phosphorylation sites at S34, S51, S56, S111, T157, S198, and S318; one potential glycosaminoglycan motif, S₂₂₄GAG; one immunoglobulin major histocompatibility motif, F₃₀₅FCNVFH; and two mitochondrial carrier protein signatures, P₃₅FDVIKIRF and P₁₃₈VDVLRTRF. HRM-12 has sequence homology with a S. cerevisiae protein (g1314086) and is found in cDNA libraries which have secretory and proliferating cells.

[0084] HRM-13 (SEQ ID NO:13) was identified in Incyte Clone 208836 from the SPLNNOT02 cDNA library using a computer search for amino acid sequence alignments. A consensus sequence, SEQ ID NO:62, was derived from the extended and overlapping nucleic acid sequences: Incyte Clones 26879 (SPLNFET01), 208836 (SPLNNOT02), and 1916142 (PROSTUT04).

[0085] In one embodiment, the invention encompasses a protein comprising the amino acid sequence of SEQ ID NO:13. HRM-13 is 343 amino acids in length and has one potential N glycosylation site at N172; 17 potential phosphorylation sites at S45, S46, T62, S73, S84, S85, S102, S105, T124, S137, Y153, T192, S216, Y226, Y241, S253, and T293; and a zinc finger motif at C₂₇₇RHYFCESCA. HRM-13 has sequence homology with a S. cerevisiae protein (g662126) and is found in cDNA libraries which have proliferating cells and are associated with immune response.

[0086] HRM-14 (SEQ ID NO:14) was identified in Incyte Clone 569710 from the MMLR3DT01 cDNA library using a computer search for amino acid sequence alignments. A consensus sequence, SEQ ID NO:63, was derived from the extended and overlapping nucleic acid sequences: Incyte Clones 145344 (TLYMNOR01) and 569710 (MMLR3DT01).

[0087] In one embodiment, the invention encompasses a protein comprising the amino acid sequence of SEQ ID NO:14. HRM-14 is 368 amino acids in length and has 10 potential phosphorylation sites at S5, T16, T125, S132, S142, S157, S167, S185, S208, and S246; and four zinc finger motifs at C₂₅₃DECGKHFSQGSALILHQRIH, C₂₈₁VECGKAFSRSSILVQHQRVH, C₃₀₉LECGKAFSQNSGLINHQRIH, and C₃₃₇VQCGKSYSQSSNLFRHQRRH. HRM-14 has sequence homology with a human zinc finger protein (g1698719) and is found in cDNA libraries which are associated with immune response.

[0088] HRM-15 (SEQ ID NO:15) was identified in Incyte Clone 606742 from the BRSTTUT01 cDNA library using a computer search for amino acid sequence alignments. A consensus sequence, SEQ ID NO:64, was derived from the extended and overlapping nucleic acid sequences: Incyte Clones 606742 (BRSTTUT01) and 1559478 (SPLNNOT04).

[0089] In one embodiment, the invention encompasses a protein comprising the amino acid sequence of SEQ ID NO:15. HRM-15 is 158 amino acids in length and has two potential myristylation sites, G₉₂GFHGQ and G₉₆QMHSR, and one potential PKC phosphorylation site, S40. HRM-15 has sequence homology with human clone 23679 (g1710201) and is found in cDNA libraries with proliferating, secretory and/or cancerous cells.

[0090] HRM-16 (SEQ ID NO:16) was identified in Incyte Clone 611135 from the COLNNOT01 cDNA library using a computer search for amino acid sequence alignments. A consensus sequence, SEQ ID NO:65, was derived from the extended and overlapping nucleic acid sequences: Incyte Clones 611135 (COLNNOT01), 659029 (BRAINOT03), and 1861691 (PROSNOT19).

[0091] In one embodiment, the invention encompasses a protein comprising the amino acid sequence of SEQ ID NO:16. HRM-16 is 334 amino acids in length and has 11 potential phosphorylation sites at S17, T29, T128, S133, S162, S176, S263, T257, S263, S277, and S294. HRM-16 has sequence homology with a C. elegans protein (g506882) and is found in cDNA libraries with secretory cells.

[0092] HRM-17 (SEQ ID NO:17) was identified in Incyte Clone 641127 from the BRSTNOT03 cDNA library using a computer search for amino acid sequence alignments. A consensus sequence, SEQ ID NO:66, was derived from the extended and overlapping nucleic acid sequences: Incyte Clones 641127 (BRSTNOT03) and 673153 (CRBLNOT01).

[0093] In one embodiment, the invention encompasses a protein comprising the amino acid sequence of SEQ ID NO:17. HRM-17 is 488 amino acids in length and has one N glycosylation site at N215; 11 potential phosphorylation sites at S70, S78, S92, T102, S111, T190, Y235, S303, S329, S415, and T471; and eight zinc finger motifs at C₂₃₇EQCGKGFTRSSSLLIHQAVH, C₂₆₅DKCGKGFTRSSSLLIHHAVH, C₂₉₃DKCGKGFSQSSKLHIHQRVH, C₃₂₁EECGMSFSQRSNLHIHQRVH, C₃₄₉GECGKGFSQSSNLHIHRCIH, C₃₇₇YECGKGFSQSSDLRIHLRVH, C₄₀₅GKCGKGFSQSSKLLIHQRVH, and C₄₃₃SKCGKGFSQSSNLHIHQRVH. HRM-17 has sequence homology with a human HOK-2 gene product (g1310668) and is found in cDNA libraries associated with sensory and secretory functions.
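The regular spacing of the eight zinc finger motifs of HRM-17 can be verified with simple arithmetic, sketched here in Python for illustration. The start positions are the subscripted cysteine positions given above, and the motif length of 21 residues is counted from the motifs as printed; the names hrm17_starts, MOTIF_LENGTH, and spacings are hypothetical.

```python
# Positions of the first cysteine of each HRM-17 zinc finger motif, as listed in the text.
hrm17_starts = [237, 265, 293, 321, 349, 377, 405, 433]

# Each motif as printed (e.g. CEQCGKGFTRSSSLLIHQAVH) is 21 residues long.
MOTIF_LENGTH = 21


def spacings(starts):
    """Return the distance between each pair of consecutive start positions."""
    return [later - earlier for earlier, later in zip(starts, starts[1:])]


gaps = spacings(hrm17_starts)
linkers = [gap - MOTIF_LENGTH for gap in gaps]

print(gaps)     # [28, 28, 28, 28, 28, 28, 28]
print(linkers)  # [7, 7, 7, 7, 7, 7, 7]
```

The motifs therefore repeat every 28 residues, leaving seven residues between the end of one motif and the start of the next.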

[0094] HRM-18 (SEQ ID NO:18) was identified in Incyte Clone 691768 from the LUNGTUT02 cDNA library using a computer search for amino acid sequence alignments. A consensus sequence, SEQ ID NO:67, was derived from the extended and overlapping nucleic acid sequences: Incyte Clones 691768 (LUNGTUT02), 1417161 (BRAINOT12), and 1931861 (COLNNOT16).

[0095] In one embodiment, the invention encompasses a protein comprising the amino acid sequence of SEQ ID NO:18. HRM-18 is 255 amino acids in length and has one potential N glycosylation site at N102 and 13 potential phosphorylation sites at S21, T90, T109, S111, T124, S134, S139, T141, S158, S172, S181, S187, and T206. HRM-18 has sequence homology with a M. musculus protein (g309183) and is found in cDNA libraries with proliferating or cancerous cells.

[0096] HRM-19 (SEQ ID NO:19) was identified in Incyte Clone 724157 from the SYNOOAT01 cDNA library using a computer search for amino acid sequence alignments. A consensus sequence, SEQ ID NO:68, was derived from the extended and overlapping nucleic acid sequences: Incyte Clones 724157 (SYNOOAT01), 1516153 (PANCTUT01), and 1610152 (COLNTUT06).

[0097] In one embodiment, the invention encompasses a protein comprising the amino acid sequence of SEQ ID NO:19. HRM-19 is 351 amino acids in length and has eight potential phosphorylation sites at T30, S41, S53, T135, S172, S187, T273, and S331; one potential glycosaminoglycan site, S₁₈GTG; and one potential mitochondrial carrier motif, P₁₃,LDVVKVRL. HRM-19 has sequence homology with C. elegans C16C10 (g577542) and is found in cDNA libraries associated with cell proliferation, cancer and immune response.

[0098] HRM-20 (SEQ ID NO:20) was identified in Incyte Clone 864683 from the BRAITUT03 cDNA library using a computer search for amino acid sequence alignments. A consensus sequence, SEQ ID NO:69, was derived from the extended and overlapping nucleic acid sequences: Incyte Clones 486297 (HNT2RAT01), 864683 (BRAITUT03), 1314465 (BLADTUT02), 1610776 (COLNTUT06), 1856771 (PROSNOT18), 1866081 (PROSNOT19), 1932221 (COLNNOT16), and 2125225 (BRSTNOT07).

[0099] In one embodiment, the invention encompasses a protein comprising the amino acid sequence of SEQ ID NO:20. HRM-20 is 535 amino acids in length and has three potential N glycosylation sites at N202, N252, and N523; and 17 potential phosphorylation sites at S2, S12, S42, S49, S102, S157, T165, T171, T232, T255, T317, S332, S428, T441, S453, S500, and S509. HRM-20 has sequence homology with a C. elegans protein (g1418563) and is found in cDNA libraries associated with cell proliferation, cancer and immune response.

[0100] HRM-21 (SEQ ID NO:21) was identified in Incyte Clone 933353 from the CERVNOT01 cDNA library using a computer search for amino acid sequence alignments. A consensus sequence, SEQ ID NO:70, was derived from the extended and overlapping nucleic acid sequences: Incyte Clones 928904 (BRAINOT04), 933353 (CERVNOT01), and 2452674 (ENDANOT01).

[0101] In one embodiment, the invention encompasses a protein comprising the amino acid sequence of SEQ ID NO:21. HRM-21 is 201 amino acids in length and has one potential N glycosylation site at N82; five potential phosphorylation sites at T70, S83, S98, S154, and T187; and one tyrosine phosphatase motif at V₁₃₀HCKAGRSRSATM. HRM-21 has sequence homology with a C. elegans protein (g1657672) and is found in cDNA libraries associated with immune response.

[0102] HRM-22 (SEQ ID NO:22) was identified in Incyte Clone 1404643 from the LATRTUT02 cDNA library using a computer search for amino acid sequence alignments. A consensus sequence, SEQ ID NO:71, was derived from the extended and overlapping nucleic acid sequences: Incyte Clones 878243 (LUNGAST01), 1404643 (LATRTUT02), 1508343 (LUNGNOT14), and 2585156 (BRAITUT22).

[0103] In one embodiment, the invention encompasses a protein comprising the amino acid sequence of SEQ ID NO:22. HRM-22 is 239 amino acids in length and has four potential phosphorylation sites at S5, S89, S133, and T211. HRM-22 has sequence homology with a C. elegans protein (g459002) and is found in cDNA libraries associated with cell proliferation, cancer and immune response.

[0104] HRM-23 (SEQ ID NO:23) was identified in Incyte Clone 1561587 from the SPLNNOT04 cDNA library using a computer search for amino acid sequence alignments. A consensus sequence, SEQ ID NO:72, was derived from the extended and overlapping nucleic acid sequences: Incyte Clones 522573 (MMLR2DT01), 773822 (COLNNOT05), 1304839 (PLACNOT02), 1381253 (BRAITUT08), 1452511 (PENITUT01), 1539060 (SINTTUT01), 1561587 (SPLNNOT04), and 2416572 (HNT3AZT01).

[0105] In one embodiment, the invention encompasses a protein comprising the amino acid sequence of SEQ ID NO:23. HRM-23 is 244 amino acids in length and has five potential phosphorylation sites at T40, S75, T84, T89, and S194. HRM-23 has sequence homology with a C. elegans protein (g868266) and is found in cDNA libraries associated with cell proliferation, cancer and immune response.

[0106] HRM-24 (SEQ ID NO:24) was identified in Incyte Clone 1568361 from the UTRSNOT05 cDNA library using a computer search for amino acid sequence alignments. A consensus sequence, SEQ ID NO:73, was derived from the extended and overlapping nucleic acid sequences: Incyte Clones 927874 (BRAINOT04), 1255220 (MENITUT03), 1242340 (LUNGNOT03), 13495 (LATRTUT02), 1381263 (BRAITUT08), 1500028 (SINTBST01), 1568361 (UTRSNOT05), 1653237 (PROSTUT08), 1975340 (PANCTUT02), and 3274608 (PROSBPT06).

[0107] In one embodiment, the invention encompasses a protein comprising the amino acid sequence of SEQ ID NO:24. HRM-24 is 431 amino acids in length and has five potential N glycosylation sites at N75, N95, N171, N202, and N298; eight potential phosphorylation sites at S2, S3, T11, T13, S17, Y316, T375, and T415; and a leucine zipper motif, L₉₆SAFNNILSNLGYILLGLLFLL. HRM-24 has sequence homology with human mucin (g1834503) and is found in cDNA libraries with proliferating, cancerous or inflamed cells.

[0108] HRM-25 (SEQ ID NO:25) was identified in Incyte Clone 1572888 from the LNODNOT03 cDNA library using a computer search for amino acid sequence alignments. A consensus sequence, SEQ ID NO:74, was derived from the extended and overlapping nucleic acid sequences: Incyte Clones 1438142 (PANCNOT08), 1572888 (LNODNOT03), and 1665075 (BRSTNOT09).

[0109] In one embodiment, the invention encompasses a protein comprising the amino acid sequence of SEQ ID NO:25. HRM-25 is 376 amino acids in length and has one N glycosylation site and five potential phosphorylation sites at S111, T150, S151, T159, and S196. HRM-25 has sequence homology with S. cerevisiae YER156c (g603396) and is found in cDNA libraries with secretory cells.

[0110] HRM-26 (SEQ ID NO:26) was identified in Incyte Clone 1573677 from the LNODNOT03 cDNA library using a computer search for amino acid sequence alignments. A consensus sequence, SEQ ID NO:75, was derived from the extended and overlapping nucleic acid sequences: Incyte Clones 040360 (TBLYNOT01), 065573 (PLACNOB01), 228382 (PANCNOT01), 1457788 (COLNFET02), 1573677 (LNODNOT03), and 1854560 (HNT3AZT01).

[0111] In one embodiment, the invention encompasses a protein comprising the amino acid sequence of SEQ ID NO:26. HRM-26 is 340 amino acids in length and has one potential N glycosylation site at N213 and 13 potential phosphorylation sites at T10, S22, T53, T56, S160, S168, S170, S177, S201, S226, S297, S303, and T329. HRM-26 has sequence homology with S. cerevisiae D9481.16 (g849195) and is found in cDNA libraries associated with secretion, immune response, and cancer.

[0112] HRM-27 (SEQ ID NO:27) was identified in Incyte Clone 1574624 from the LNODNOT03 cDNA library using a computer search for amino acid sequence alignments. A consensus sequence, SEQ ID NO:76, was derived from the extended and overlapping nucleic acid sequences: Incyte Clones 90012 (HYPONOB01), 888491 (STOMTUT01), and 1574624 (LNODNOT03).

[0113] In one embodiment, the invention encompasses a protein comprising the amino acid sequence of SEQ ID NO:27. HRM-27 is 174 amino acids in length and has one N glycosylation site at N51 and five potential phosphorylation sites at S111, T150, S151, T159, and T196. HRM-27 has sequence homology with a C. elegans protein (g1067025) and is found in cDNA libraries associated with secretion, immune response, and cancer.

[0114] HRM-28 (SEQ ID NO:28) was identified in Incyte Clone 1577239 from the LNODNOT03 cDNA library using a computer search for amino acid sequence alignments. A consensus sequence, SEQ ID NO:77, was derived from the extended and overlapping nucleic acid sequences: Incyte Clones 100565 (ADRENOT01), 1336693 (COLNNOT13), and 1577239 (LNODNOT03).

[0115] In one embodiment, the invention encompasses a protein comprising the amino acid sequence of SEQ ID NO:28. HRM-28 is 179 amino acids in length and has one potential N glycosylation site at N60 and five potential phosphorylation sites at Y61, S62, Y104, T136, and Y142. HRM-28 has sequence homology with a S. cerevisiae protein (g728657) and is found in cDNA libraries associated with secretion and immune response.

[0116] HRM-29 (SEQ ID NO:29) was identified in Incyte Clone 1598203 from the BLADNOT03 cDNA library using a computer search for amino acid sequence alignments. A consensus sequence, SEQ ID NO:78, was derived from the extended and overlapping nucleic acid sequences: Incyte Clones 1598203 (BLADNOT03), 1697035 (COLNNOT23), and 1932332 (COLNNOT16).

[0117] In one embodiment, the invention encompasses a protein comprising the amino acid sequence of SEQ ID NO:29. HRM-29 is 205 amino acids in length and has one potential N glycosylation site at N117 and five potential phosphorylation sites at T68, T118, S137, S140, and S159. HRM-29 has sequence homology with a C. elegans protein (g1200033) and is found in cDNA libraries associated with secretion.

[0118] HRM-30 (SEQ ID NO:30) was identified in Incyte Clone 1600438 from the BLADNOT03 cDNA library using a computer search for amino acid sequence alignments. A consensus sequence, SEQ ID NO:79, was derived from the extended and overlapping nucleic acid sequences: Incyte Clones 835283 (PROSNOT07), 1600044 (BLADNOT03), 1600438 (BLADNOT03), and 1922072 (BRSTTUT01).

[0119] In one embodiment, the invention encompasses a protein comprising the amino acid sequence of SEQ ID NO:30. HRM-30 is 419 amino acids in length and has one potential N glycosylation site at N161; twelve potential phosphorylation sites at T16, S57, T67, T83, S100, T107, S144, S206, T254, Y351, S412, and S414; a leucine zipper motif, L₃₈NEAGDDLEAVAKFLDSGSRL; and an ATP/GTP binding motif, A₃₈₅HVAKGKS. HRM-30 has sequence homology with human KIAA0005 (g286001) and is found in cDNA libraries associated with secretion and cancer.

[0120] HRM-31 (SEQ ID NO:31) was identified in Incyte Clone 1600518 from the BLADNOT03 cDNA library using a computer search for amino acid sequence alignments. A consensus sequence, SEQ ID NO:80, was derived from the extended and overlapping nucleic acid sequences: Incyte Clones 389679 (THYMNOT02), 1600518 (BLADNOT03), 2055734 (BEPINOT02), and 2509270 (CONUTUT01).

[0121] In one embodiment, the invention encompasses a protein comprising the amino acid sequence of SEQ ID NO:31. HRM-31 is 376 amino acids in length and has one potential N glycosylation site at N161; 14 potential phosphorylation sites at T30, S65, S75, S95, S106, T134, S159, S224, T228, T250, T292, S299, T303, and S323; and a glycosaminoglycan motif, S₁₄GPG. HRM-31 has sequence homology with a C. elegans protein (g790405) and is found in cDNA libraries associated with immune response, secretion and cancer.

[0122] HRM-32 (SEQ ID NO:32) was identified in Incyte Clone 1602473 from the BLADNOT03 cDNA library using a computer search for amino acid sequence alignments. A consensus sequence, SEQ ID NO:81, was derived from the extended and overlapping nucleic acid sequences: Incyte Clones 1351857 (LATRTUT02), 1602473 (BLADNOT03), and 2478778 (SMCANOT01).

[0123] In one embodiment, the invention encompasses a protein comprising the amino acid sequence of SEQ ID NO:32. HRM-32 is 237 amino acids in length and has seven potential phosphorylation sites at T51, T68, S92, S143, T171, S193, and S203. HRM-32 has sequence homology with a Haemophilus influenzae protein (g1574570) and is found in cDNA libraries associated with immune response and cancer.

[0124] HRM-33 (SEQ ID NO:33) was identified in Incyte Clone 1605720 from the LUNGNOT15 cDNA library using a computer search for amino acid sequence alignments. A consensus sequence, SEQ ID NO:82, was derived from the extended and overlapping nucleic acid sequences: Incyte Clones 660915 (BRAINOT03), 1347135 (PROSNOT11), and 1605720 (LUNGNOT15).

[0125] In one embodiment, the invention encompasses a protein comprising the amino acid sequence of SEQ ID NO:33. HRM-33 is 152 amino acids in length and has four potential phosphorylation sites at S10, S23, T34, and S66; and a leucine zipper motif, L₇₇AVGNYRLKEYEKALKYVRGLL. HRM-33 has sequence homology with a C. elegans protein (g1055080) and is found in cDNA libraries associated with secretion and immune response.

[0126] HRM-34 (SEQ ID NO:34) was identified in Incyte Clone 1610501 from the COLNTUT06 cDNA library using a computer search for amino acid sequence alignments. A consensus sequence, SEQ ID NO:83, was derived from the extended and overlapping nucleic acid sequences: Incyte Clones 1610501 (COLNTUT06) and 2477716 (SMCANOT01).

[0127] In one embodiment, the invention encompasses a protein comprising the amino acid sequence of SEQ ID NO:34. HRM-34 is 179 amino acids in length and has five potential phosphorylation sites at S32, S48, T45, T50, and T52. HRM-34 has sequence homology with a S. cerevisiae protein (g313741) and is found in cDNA libraries associated with cancer and immune response.

[0128] HRM-35 (SEQ ID NO:35) was identified in Incyte Clone 1720770 from the BLADNOT06 cDNA library using a computer search for amino acid sequence alignments. A consensus sequence, SEQ ID NO:84, was derived from the extended and overlapping nucleic acid sequences: Incyte Clones 681455 (UTRSNOT02), 813292 (LUNGNOT04), 1223029 (COLNTUT02), 1444186 (THYRNOT03), 1522592 (BLADTUT04), 1720770 (BLADNOT06), and 1798409 (COLNNOT27).

[0129] In one embodiment, the invention encompasses a protein comprising the amino acid sequence of SEQ ID NO:35. HRM-35 is 196 amino acids in length and has an amidation motif, H₁₇₉GKR, and seven potential phosphorylation sites at S2, S6, S31, S84, S90, T136, and T161. HRM-35 has sequence homology with a C. elegans protein (g1006641) and is found in cDNA libraries associated with secretion, immune response, and cancer.

[0130] HRM-36 (SEQ ID NO:36) was identified in Incyte Clone 1832295 from the BRAINON01 cDNA library using a computer search for amino acid sequence alignments. A consensus sequence, SEQ ID NO:85, was derived from the extended and overlapping nucleic acid sequences: Incyte Clones 060275 (LUNGNOT01), 1823989 (GBLATUT01), and 1832295 (BRAINON01).

[0131] In one embodiment, the invention encompasses a protein comprising the amino acid sequence of SEQ ID NO:36. HRM-36 is 612 amino acids in length and has 12 potential N glycosylation sites at N36, N95, N139, N146, N151, N176, N188, N226, N243, N353, N371, and N482; and 16 potential phosphorylation sites at S58, S92, S112, T153, T198, T248, S308, S373, T400, T420, T428, Y438, T458, T472, S527, and S556. HRM-36 has sequence homology with human enigma protein (g561637) and is found in cDNA libraries associated with secretion and immune response.

[0132] HRM-37 (SEQ ID NO:37) was identified in Incyte Clone 1990522 from the CORPNOT02 cDNA library using a computer search for amino acid sequence alignments. A consensus sequence, SEQ ID NO:86, was derived from the extended and overlapping nucleic acid sequences: Incyte Clones 264363 (HNT2AGTO0), 1990522 (CORPNOT02), and 2451448 (ENDANOT01).

[0133] In one embodiment, the invention encompasses a protein comprising the amino acid sequence of SEQ ID NO:37. HRM-37 is 101 amino acids in length and has a PKC phosphorylation site at S62. HRM-37 has sequence homology with a S. cerevisiae protein (g558396) and is found in cDNA libraries associated with immune response.

[0134] HRM-38 (SEQ ID NO:38) was identified in Incyte Clone 2098087 from the BRAITUT02 cDNA library using a computer search for amino acid sequence alignments. A consensus sequence, SEQ ID NO:87, was derived from the extended and overlapping nucleic acid sequences: Incyte Clones 690359 (LUNGTUT02), 1429907 (SINTBST01), and 2098087 (BRAITUT02).

[0135] In one embodiment, the invention encompasses a protein comprising the amino acid sequence of SEQ ID NO:38. HRM-38 is 132 amino acids in length and has a potential ATP/GTP binding motif at G₇₄ARNLLKS. HRM-38 has sequence homology with M. musculus uterine protein (g166284) and is found in cDNA libraries associated with immune response.

[0136] HRM-39 (SEQ ID NO:39) was identified in Incyte Clone 2112230 from the BRAITUT03 cDNA library using a computer search for amino acid sequence alignments. A consensus sequence, SEQ ID NO:88, was derived from the extended and overlapping nucleic acid sequences: Incyte Clones 1383278 (BRAITUT08), 1646103 (PROSTUT09), 2112230 (BRAITUT03), and 2510591 (CONUTUT01).

[0137] In one embodiment, the invention encompasses a protein comprising the amino acid sequence of SEQ ID NO:39. HRM-39 is 188 amino acids in length and has a potential N glycosylation site at N87 and eight potential phosphorylation sites at T10, T28, S74, S93, T121, T128, Y168, and T169. HRM-39 has sequence homology with a C. elegans protein (g861306) and is found in cDNA libraries from cancerous tissues.

[0138] HRM-40 (SEQ ID NO:40) was identified in Incyte Clone 2117050 from the BRAITUT02 cDNA library using a computer search for amino acid sequence alignments. A consensus sequence, SEQ ID NO:89, was derived from the extended and overlapping nucleic acid sequences: Incyte Clones 941515 (ADRENOT03), 1549443 (PROSNOT06), 2113261 (BRAITUT03), 2117050 (BRSTTUT02), and 2530536 (GBLANOT02).

[0139] In one embodiment, the invention encompasses a protein comprising the amino acid sequence of SEQ ID NO:40. HRM-40 is 86 amino acids in length and has a potential N glycosylation site at N58 and four potential phosphorylation sites at T2, S9, T26, and T27. HRM-40 has sequence homology with a C. elegans protein (g687821) and is found in cDNA libraries involved in cell proliferation, secretion, cancer, and immune response.

[0140] HRM-41 (SEQ ID NO:41) was identified in Incyte Clone 2184712 from the SININOT01 cDNA library using a computer search for amino acid sequence alignments. A consensus sequence, SEQ ID NO:90, was derived from the extended and overlapping nucleic acid sequences: Incyte Clones 922736 (RATRNOT02), 1976003 (PANCTUT02), and 2184712 (SININOT01).

[0141] In one embodiment, the invention encompasses a protein comprising the amino acid sequence of SEQ ID NO:41. HRM-41 is 222 amino acids in length and has a potential amidation site, K₁₀GKK; a potential glycosaminoglycan site, S₂GLG; a potential N glycosylation site, N95; and seven potential phosphorylation sites at T18, T29, T50, S84, T98, S112, and S188. HRM-41 has sequence homology with a C. elegans protein (g868241) and is found in cDNA libraries involved in cell proliferation, secretion, cancer, and immune response.

[0142] HRM-42 (SEQ ID NO:42) was identified in Incyte Clone 2290475 from the BRAINON01 cDNA library using a computer search for amino acid sequence alignments. A consensus sequence, SEQ ID NO:91, was derived from the extended and overlapping nucleic acid sequences: Incyte Clones 238339 (SINTNOT02), 1657945 (URETTUT01), 1848691 (LUNGFET03), 2044604 (THP1T7T01), 2290475 (BRAINON01), and 2514944 (LIVRTUT04).

[0143] In one embodiment, the invention encompasses a protein comprising the amino acid sequence of SEQ ID NO:42. HRM-42 is 300 amino acids in length and has a potential N glycosylation site, N5; seven potential phosphorylation sites at S23, S71, S132, S142, T176, T192, and S293; and a MutT signature, G₁₆₅MVDPGEKISATLKREFGEE. HRM-42 has sequence homology with a C. elegans protein (g733605) and is found in cDNA libraries involved in cell proliferation, secretion, cancer, and immune response.

[0144] HRM-43 (SEQ ID NO:43) was identified in Incyte Clone 2353452 from the LUNGNOT20 cDNA library using a computer search for amino acid sequence alignments. A consensus sequence, SEQ ID NO:92, was derived from the extended and overlapping nucleic acid sequences: Incyte Clones 1000164 (BRSTNOT03), 1308080 (COLNFET02), 1900151 (BLADTUT06), and 2353452 (LUNGNOT20).

[0145] In one embodiment, the invention encompasses a protein comprising the amino acid sequence of SEQ ID NO:43. HRM-43 is 112 amino acids in length and has six potential phosphorylation sites at T23, T43, S44, T79, T84, and T98. HRM-43 has sequence homology with a Schizosaccharomyces pombe protein (g1507666) and is found in cDNA libraries involved in cell proliferation, secretion, cancer, and immune response.

[0146] HRM-44 (SEQ ID NO:44) was identified in Incyte Clone 2469611 from the THP1NOT03 cDNA library using a computer search for amino acid sequence alignments. A consensus sequence, SEQ ID NO:93, was derived from the extended and overlapping nucleic acid sequences: Incyte Clones 003088 (HMC1NOT01), 1448981 (PLACNOT02), 1453563 (PENITUT01), 1824146 (GBLATUT01), 2369282 (ADRENOT07), 2469611 (THP1NOT03), and 2622587 (KERANOT02).

[0147] In one embodiment, the invention encompasses a protein comprising the amino acid sequence of SEQ ID NO:44. HRM-44 is 251 amino acids in length and has a potential glycosaminoglycan site, S218GFG, and four potential phosphorylation sites at T8, S83, S212, and S226. HRM-44 has sequence homology with a C. elegans protein (g1495332) and is found in cDNA libraries involved in cell proliferation, secretion, cancer, and immune response.

[0148] HRM-45 (SEQ ID NO:45) was identified in Incyte Clone 2515476 from the LIVRTUT04 cDNA library using a computer search for amino acid sequence alignments. A consensus sequence, SEQ ID NO:94, was derived from the extended and overlapping nucleic acid sequences: Incyte Clones 18414 (HUVELPB01), 78341 (SYNORAB01), 143277 (TLYMNOR01), 181574 (PLACNOB01), 832996 (PROSTUT04), 962753 (BRSTTUT03), 1413604 (BRAINOT12), and 2515476 (LIVRTUT04).

[0149] In one embodiment, the invention encompasses a protein comprising the amino acid sequence of SEQ ID NO:45. HRM-45 is 811 amino acids in length and has three potential amidation sites at G₁₁₃GRR, W₁₆₅GKR, and G₇₉₀GKK; four potential N glycosylation sites at N22, N56, N79, and N145; 24 potential phosphorylation sites at T11, S13, S30, S60, Y71, S81, S85, S86, S103, S254, S256, T377, S388, S425, S456, S487, T544, S552, S574, T659, S678, S702, S746, and S753; a potential glycosaminoglycan site, S₁₆₀GHG; and a potential zinc finger motif at C₂₄₀GHIFCWACI. HRM-45 has sequence homology with human KIAA0262 (g1665790) and is found in cDNA libraries involved in cell proliferation, secretion, cancer, and immune response.

[0150] HRM-46 (SEQ ID NO:46) was identified in Incyte Clone 2754573 from the THP1AZS08 cDNA library using a computer search for amino acid sequence alignments. A consensus sequence, SEQ ID NO:95, was derived from the extended and overlapping nucleic acid sequences: Incyte Clones 263630 (HNT2AGTO1), 412307 (BRSTNOT01), 491644 (HNT2AGTO1), 1253094 (LUNGFET03), 2270603 (PROSNON01), 2280508 (PROSNON01), 2375670 (ISLTNOT01), 2754573 (THP1AZS08), and 3151587 (ADRENON04).

[0151] In one embodiment, the invention encompasses a protein comprising the amino acid sequence of SEQ ID NO:46. HRM-46 is 352 amino acids in length and has two potential N glycosylation sites at N141 and N294, and thirteen potential phosphorylation sites at S8, T67, T106, T110, T121, S122, S169, S206, T210, S215, S256, S260, and T296. HRM-46 has sequence homology with human RNA binding protein (g478990) and is found in cDNA libraries involved in cell proliferation, secretion, and immune response.

[0152] HRM-47 (SEQ ID NO:47) was identified in Incyte Clone 2926777 from the TLYMNOT04 cDNA library using a computer search for amino acid sequence alignments. A consensus sequence, SEQ ID NO:96, was derived from the extended and overlapping nucleic acid sequences: Incyte Clones 040208 (TBLYNOT01), 900242 (BRSTTUT03), 963500 (BRSTTUT03), 1996474 (BRSTTUT03), and 2926777 (TLYMNOT04).

[0153] In one embodiment, the invention encompasses a protein comprising the amino acid sequence of SEQ ID NO:47. HRM-47 is 432 amino acids in length and has a potential N glycosylation site at N417 and 24 potential phosphorylation sites at T51, S73, T122, T133, S177, S206, T226, T238, S293, S300, S304, S309, T325, S333, S339, S353, S360, Y361, S384, S390, T403, T412, T419, and S425. HRM-47 has sequence homology with a C. elegans protein (g687823) and is found in cDNA libraries involved in cell proliferation, secretion, cancer, and immune response.

[0154] HRM-48 (SEQ ID NO:48) was identified in Incyte Clone 3217567 from the TESTNOT07 cDNA library using a computer search for amino acid sequence alignments. A consensus sequence, SEQ ID NO:97, was derived from the extended and overlapping nucleic acid sequences: Incyte Clones 905037 (COLNNOT07), 1287503 (BRAINOT11), and 3217567 (TESTNOT07).

[0155] In one embodiment, the invention encompasses a protein comprising the amino acid sequence of SEQ ID NO:48. HRM-48 is 180 amino acids in length and has a potential zinc finger motif, C42GHLYCWPCL, and five potential phosphorylation sites at T33, T57, S84, T148, and S160. HRM-48 has sequence homology with human HLA class III region (g1841547) and is found in cDNA libraries involved in secretion and immune response.

[0156] HRM-49 (SEQ ID NO:49) was identified in Incyte Clone 3339274 from the SPLNNOT10 cDNA library using a computer search for amino acid sequence alignments. A consensus sequence, SEQ ID NO:98, was derived from the extended and overlapping nucleic acid sequences: Incyte Clones 532254 (BRAINOT03), 941336 (ADRENOT03), 2447649 (THP1NOT03), and 3339274 (SPLNNOT10).

[0157] In one embodiment, the invention encompasses a protein comprising the amino acid sequence of SEQ ID NO:49. HRM-49 is 137 amino acids in length and has three potential phosphorylation sites at T11, T91, and S119. HRM-49 has sequence homology with a deduced human translational inhibitor (g1177434) and is found in cDNA libraries involved in secretion and immune response.
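
As an illustration of how potential N glycosylation sites such as those listed for the proteins above might be located, the following minimal sketch scans a protein sequence for the commonly used consensus pattern N-{P}-[ST]-{P}. The pattern choice, the function name find_n_glycosylation_sites, and the example peptide are assumptions made for illustration; this section does not state which motif-search tool or pattern was used, and the peptide is not one of SEQ ID NOs:1-49.

```python
import re

# Assumed consensus for N-linked glycosylation: Asn, any residue except Pro,
# Ser or Thr, any residue except Pro. The lookahead lets overlapping sites match.
N_GLYC_PATTERN = re.compile(r"(?=N[^P][ST][^P])")


def find_n_glycosylation_sites(protein: str) -> list[int]:
    """Return 1-based positions of Asn residues that start the assumed motif."""
    return [m.start() + 1 for m in N_GLYC_PATTERN.finditer(protein.upper())]


# Hypothetical peptide, not taken from the sequence listing.
example_peptide = "MANGSAKNPTLNCTQ"
sites = find_n_glycosylation_sites(example_peptide)
print(["N%d" % pos for pos in sites])
```

A match only flags a candidate site; whether the residue is actually glycosylated would have to be established experimentally.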

[0158] The invention also encompasses HRM variants which retain the biological or functional activity of HRM. A preferred HRM variant is one having at least 60% amino acid sequence identity to an amino acid sequence selected from SEQ ID NOs:1-49.
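
The 60% identity criterion can be sketched as follows. This is a minimal illustration that assumes the two sequences have already been aligned (gaps written as "-") and that identity is counted as identical residues divided by the number of aligned columns; the section above does not specify the alignment program or the exact identity formula, and the function names and example strings are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of aligned columns holding the same residue in both sequences."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    # Ignore columns that are gaps in both sequences.
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b)
               if not (a == "-" and b == "-")]
    if not columns:
        return 0.0
    matches = sum(1 for a, b in columns if a == b and a != "-")
    return 100.0 * matches / len(columns)


def meets_variant_threshold(aligned_a: str, aligned_b: str,
                            threshold: float = 60.0) -> bool:
    """True if identity is at least the threshold (60% in the text above)."""
    return percent_identity(aligned_a, aligned_b) >= threshold


# Hypothetical aligned fragments, not taken from the sequence listing.
print(meets_variant_threshold("MKTAY-", "MKSAYL"))  # 4 of 6 columns match
print(meets_variant_threshold("AAAA", "AAGG"))      # 2 of 4 columns match
```

Other conventions (for example, dividing by the length of the shorter sequence) give different percentages, so the denominator should be stated whenever an identity value is reported.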

[0159] The invention also encompasses polynucleotides which encode HRM. Accordingly, any nucleic acid sequence which encodes the amino acid sequence of HRM can be used to produce recombinant molecules which express HRM. In a particular embodiment, the invention encompasses a polynucleotide comprising a nucleic acid sequence selected from SEQ ID NOs:50-98 and fragments and complements thereof.

[0160] It will be appreciated by those skilled in the art that as a result of the degeneracy of the genetic code, a multitude of polynucleotides encoding HRM, some bearing minimal homology to the polynucleotides of any known and naturally occurring gene, may be produced. Thus, the invention contemplates each and every possible variation of polynucleotide that could be made by selecting combinations based on possible codon choices. These combinations are made in accordance with the standard triplet genetic code as applied to the polynucleotide of naturally occurring HRM, and all such variations are to be considered as being specifically disclosed.
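
To give a sense of scale for this degeneracy, the number of distinct coding sequences for a peptide under the standard genetic code is the product of the number of codons available for each residue. The sketch below is illustrative only; the function name and the four-residue example peptide are hypothetical and are not drawn from the sequence listing.

```python
from math import prod

# Number of sense codons per amino acid in the standard genetic code (61 total).
CODONS_PER_AMINO_ACID = {
    "A": 4, "R": 6, "N": 2, "D": 2, "C": 2, "Q": 2, "E": 2, "G": 4,
    "H": 2, "I": 3, "L": 6, "K": 2, "M": 1, "F": 2, "P": 4, "S": 6,
    "T": 4, "W": 1, "Y": 2, "V": 4,
}


def count_encoding_sequences(peptide: str) -> int:
    """Count the distinct codon combinations that encode the given peptide."""
    return prod(CODONS_PER_AMINO_ACID[aa] for aa in peptide.upper())


# Hypothetical peptide Met-Leu-Lys-Arg: 1 * 6 * 2 * 6 combinations.
print(count_encoding_sequences("MLKR"))
```

Because the count grows multiplicatively with length, even a protein of modest size has far more possible coding sequences than could be enumerated explicitly.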

[0161] Although polynucleotides which encode HRM and its variants are preferably capable of hybridizing to the polynucleotide of the naturally occurring HRM under selected conditions of stringency, it may be advantageous to produce polynucleotides encoding HRM or its derivatives possessing a different codon usage. Codons may be selected to increase the rate at which expression of the peptide occurs in a particular prokaryotic or eukaryotic host in accordance with the frequency with which particular codons are utilized by the host. Other reasons for altering the polynucleotide encoding HRM and its derivatives without altering the encoded amino acid sequences include the production of RNA transcripts having more desirable properties, such as a greater half-life, than transcripts produced from the naturally occurring sequence.

[0162] The invention also encompasses production of polynucleotides, or fragments thereof, which encode HRM and its derivatives, entirely by synthetic chemistry. After production, the synthetic sequence may be inserted into any of the many available expression vectors and cell systems using reagents that are well known in the art. Moreover, synthetic chemistry may be used to introduce mutations into a sequence encoding HRM or any fragment thereof.

[0163] Also encompassed by the invention are polynucleotides that are capable of hybridizing to the nucleic acids of a sample, and in particular, the polynucleotides or the complements thereof shown in SEQ ID NOs:50-98, under various conditions of stringency as taught in Wahl and Berger (1987; Methods Enzymol 152:399-407) and Kimmel (1987; Methods Enzymol 152:507-511).

[0164] Methods for DNA sequencing are well known and generally available in the art and may be used to practice any of the embodiments of the invention. The methods may employ such enzymes as the Klenow fragment of DNA polymerase I, SEQUENASE, Taq DNA polymerase and thermostable T7 DNA polymerase (Amersham Pharmacia Biotech (APB), Piscataway NJ), or combinations of polymerases and proofreading exonucleases such as those found in the ELONGASE amplification system (Life Technologies, Gaithersburg MD). Preferably, the process is automated with machines such as the MICROLAB system (Hamilton, Reno NV), DNA ENGINE thermal cycler (MJ Research, Watertown MA), and the Catalyst preparation and 373 and 377 PRISM DNA sequencing systems (ABI).

[0165] The nucleic acid sequences encoding HRM may be extended utilizing a partial nucleotide sequence and employing various methods known in the art to detect upstream sequences such as promoters and regulatory elements. For example, one method which may be employed, “restriction-site” PCR, uses universal primers to retrieve unknown sequence adjacent to a known locus (Sarkar (1993) PCR Methods Applic 2:318-322). In particular, genomic DNA is first amplified in the presence of a primer to a linker sequence and a primer specific to the known region. The amplified sequences are then subjected to a second round of PCR with the same linker primer and another specific primer internal to the first one. Products of each round of PCR are transcribed with an RNA polymerase and sequenced using reverse transcriptase.

[0166] Inverse PCR may also be used to amplify or extend sequences using divergent primers based on a known region (Triglia et al. (1988) Nucleic Acids Res 16:8186). The primers may be designed using commercially available software such as OLIGO software (Molecular Insights, Cascade CO), or another program, to be 22-30 nucleotides in length, to have a GC content of 50% or more, and to anneal to the target sequence at temperatures of about 68-72 °C. The method uses several restriction enzymes to generate a fragment in the known region of a gene. The fragment is then circularized by intramolecular ligation and used as a PCR template.
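
The length and GC-content criteria stated above can be checked with a few lines of code. The sketch below is a hedged illustration and not a substitute for the primer design software named in the text; the function names and example oligonucleotides are hypothetical, and the annealing-temperature criterion is deliberately left out because the text does not say how that temperature is to be calculated.

```python
def gc_fraction(primer: str) -> float:
    """Fraction of bases in the primer that are G or C."""
    if not primer:
        raise ValueError("primer must not be empty")
    seq = primer.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def meets_primer_criteria(primer: str, min_len: int = 22, max_len: int = 30,
                          min_gc: float = 0.5) -> bool:
    """Check the 22-30 nucleotide length and >= 50% GC criteria from the text."""
    return min_len <= len(primer) <= max_len and gc_fraction(primer) >= min_gc


# Hypothetical oligonucleotides, not derived from SEQ ID NOs:50-98.
print(meets_primer_criteria("GCGCATGCATGCGGCCATGCAT"))    # 22 nt, 14 G/C
print(meets_primer_criteria("ATATATATATATATATATATATAT"))  # 24 nt, no G/C
```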

[0167] Another method which may be used is capture PCR which involves PCR amplification of DNA fragments adjacent to a known sequence in human and yeast artificial chromosome DNA (Lagerstrom et al. (1991) PCR Methods Applic 1:111-119). In this method, multiple restriction enzyme digestions and ligations may also be used to place an engineered double-stranded sequence into an unknown fragment of the DNA molecule before performing PCR.

[0168] Another method which may be used to retrieve unknown sequences is that of Parker et al. (1991; Nucleic Acids Res 19:3055-3060). One may also use PCR, nested primers, and PROMOTERFINDER libraries (Clontech, Palo Alto CA) to walk genomic DNA. This process avoids the need to screen libraries of cDNAs for longer sequences and is very useful in finding intron/exon junctions.

[0169] When screening for full-length cDNAs, it is preferable to use libraries that have been size-selected to include larger cDNAs. Also, random-primed libraries are preferable, in that they will contain more sequences which contain the 5′ regions of genes. Use of a randomly primed library may be especially preferable for situations in which an oligo d(T) library does not yield a full-length cDNA. Genomic libraries may be useful for extension of sequence into 5′ non-transcribed regulatory regions.

[0170] Capillary electrophoresis systems which are commercially available may be used to analyze the size or confirm the sequence of sequencing or PCR products. In particular, capillary sequencing may employ flowable polymers for electrophoretic separation, four different fluorescent dyes (one for each nucleotide) which are laser activated, and detection of the emitted wavelengths by a charge coupled device camera. Output/light intensity may be converted to electrical signal using software integral to the system, and the entire process from loading of samples to computer analysis and electronic data display may be computer controlled. Capillary electrophoresis is especially preferable for the sequencing of small pieces of DNA which might be present in limited amounts in a particular sample.

[0171] In another embodiment of the invention, polynucleotides or fragments thereof which encode HRM may be used in recombinant DNA molecules to direct expression of HRM, portions or functional equivalents thereof, in host cells. Due to the inherent degeneracy of the genetic code, other DNA sequences which encode the same or a functionally equivalent amino acid sequence may be produced, and these sequences may be used to clone and express HRM.

[0172] As will be understood by those of skill in the art, it may be advantageous to produce HRM-encoding polynucleotides possessing non-naturally occurring codons. For example, codons preferred by a particular prokaryotic or eukaryotic host can be selected to increase the rate of protein expression or to produce an RNA transcript having desirable properties, such as a half-life which is longer than that of a transcript generated from the naturally occurring sequence.
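
Selecting host-preferred codons without changing the encoded protein can be sketched as a codon-by-codon substitution. In the illustration below, the preference table HYPOTHETICAL_HOST_PREFERENCE is invented for demonstration and does not represent measured codon usage for any host; the function names and the 12-nucleotide example are likewise assumptions. Residues with no listed preference keep their original codon.

```python
from itertools import product

# Standard genetic code built in TCAG order; "*" marks stop codons.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa
               for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}

# Hypothetical preferences: one chosen codon per amino acid (illustrative only).
HYPOTHETICAL_HOST_PREFERENCE = {"L": "CTG", "K": "AAA", "R": "CGT"}


def translate(cds: str) -> str:
    """Translate a coding sequence whose length is a multiple of three."""
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, len(cds), 3))


def recode_for_host(cds: str, preference: dict[str, str]) -> str:
    """Swap each codon for the preferred synonymous codon, if one is listed."""
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of three")
    recoded = []
    for i in range(0, len(cds), 3):
        codon = cds[i:i + 3]
        recoded.append(preference.get(CODON_TABLE[codon], codon))
    return "".join(recoded)


original = "ATGTTAAAGAGA"  # hypothetical fragment encoding Met-Leu-Lys-Arg
recoded = recode_for_host(original, HYPOTHETICAL_HOST_PREFERENCE)
print(recoded, translate(original) == translate(recoded))
```

Comparing the two translations, as in the last line, is a simple way to confirm that only the nucleotide sequence has changed.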

[0173] The polynucleotides of the present invention can be engineered using methods generally known in the art in order to alter HRM encoding sequences for a variety of reasons, including but not limited to, alterations which modify the cloning, processing, and/or expression of the gene product. DNA shuffling by random fragmentation and PCR reassembly of gene fragments and synthetic oligonucleotides may be used to engineer the polynucleotides. For example, site-directed mutagenesis may be used to insert new restriction sites, alter glycosylation patterns, change codon preference, produce splice variants, introduce mutations, and so forth.

[0174] In another embodiment of the invention, natural, modified, or recombinant nucleic acid sequences encoding HRM may be ligated to a heterologous sequence to encode a fusion protein. For example, to screen peptide libraries for inhibitors of HRM activity, it may be useful to encode a chimeric HRM protein that can be recognized by a commercially available antibody. A fusion protein may also be engineered to contain a cleavage site located between the HRM encoding sequence and the heterologous protein sequence, so that HRM may be cleaved and purified away from the heterologous moiety.

[0175] In another embodiment, sequences encoding HRM may be synthesized, in whole or in part, using chemical methods well known in the art (Caruthers et al. (1980) Nucleic Acids Symp Ser. (7) 215-223; Horn et al. (1980) Nucleic Acids Symp Ser. (7) 225-232). Alternatively, the protein itself may be produced using chemical methods to synthesize the amino acid sequence of HRM, or a portion thereof. For example, peptide synthesis can be performed using various solid-phase techniques (Roberge et al. (1995) Science 269:202-204) and automated synthesis may be achieved, for example, using the 431A peptide synthesizer (ABI).

[0176] The newly synthesized peptide may be purified by preparative high performance liquid chromatography (see Creighton (1983) Proteins, Structures and Molecular Principles, WH Freeman, New York N.Y.). The composition of the synthetic peptides may be confirmed by amino acid analysis or sequencing (e.g., the Edman degradation procedure; Creighton, supra). Additionally, the amino acid sequence of HRM, or any part thereof, may be altered during direct synthesis and/or combined using chemical methods with sequences from other proteins, or any part thereof, to produce a variant protein.

[0177] In order to express a biologically active HRM, the polynucleotides encoding HRM or functional equivalents may be inserted into an expression vector, i.e., a vector which contains the necessary elements for the transcription and translation of the inserted coding sequence.

[0178] Methods which are well known to those skilled in the art may be used to construct expression vectors containing sequences encoding HRM and transcriptional and translational control elements. These methods include in vitro recombinant DNA techniques, synthetic techniques, and in vivo genetic recombination. Such techniques are described in Sambrook et al. (1989; Molecular Cloning, A Laboratory Manual, Cold Spring Harbor Press, Plainview N.Y.) and Ausubel et al. (1989; Current Protocols in Molecular Biology, John Wiley & Sons, New York N.Y.).

[0179] A variety of expression vector/host systems may be utilized to contain and express sequences encoding HRM. These include, but are not limited to, microorganisms such as bacteria transformed with recombinant bacteriophage, plasmid, or cosmid DNA expression vectors; yeast transformed with yeast expression vectors; insect cell systems infected with baculovirus expression vectors; plant cell systems transformed with virus expression vectors (e.g., cauliflower mosaic virus or tobacco mosaic virus) or with bacterial expression vectors (Ti or pBR322 plasmids); or animal cell systems. The invention is not limited by the host cell employed.

[0180] The “control elements” or “regulatory sequences” are those non-translated regions of the vector--enhancers, promoters, 5′ and 3′ untranslated regions--which interact with host cellular proteins to carry out transcription and translation. Such elements may vary in their strength and specificity. Depending on the vector system and host utilized, any number of transcription and translation elements, including constitutive and inducible promoters, may be used. For example, when cloning in bacterial systems, inducible promoters such as the hybrid lacZ promoter of the BLUESCRIPT phagemid (Stratagene, La Jolla CA) or the pSport1 plasmid (Life Technologies) may be used. The baculovirus polyhedrin promoter may be used in insect cells. Promoters or enhancers derived from the genomes of plant cells (e.g., heat shock, RUBISCO, and storage protein genes) or from plant viruses (e.g., viral promoters or leader sequences) may be cloned into the vector if the protein is to be produced in plant cells. In mammalian cell systems, promoters from mammalian genes or from mammalian viruses are preferable. If it is necessary to generate a cell line that contains multiple copies of the sequence encoding HRM, vectors based on SV40 or EBV may be used with a selectable marker.

[0181] In bacterial systems, a number of expression vectors may be selected depending upon the use intended for HRM. For example, when large quantities of HRM are needed for the induction of antibodies, vectors which direct high level expression of fusion proteins that are readily purified may be used. Such vectors include, but are not limited to, the multifunctional E. coli cloning and expression vectors such as BLUESCRIPT phagemid (Stratagene), in which the sequence encoding HRM may be ligated into the vector in frame with sequences for the amino-terminal Met and the subsequent 7 residues of β-galactosidase so that a hybrid protein is produced; pIN vectors (Van Heeke and Schuster (1989) J Biol Chem 264:5503-5509); and the like. pGEX vectors (APB) may also be used to express foreign proteins as fusion proteins with glutathione S-transferase (GST). In general, such fusion proteins are soluble and can easily be purified from lysed cells by adsorption to glutathione-agarose beads followed by elution in the presence of free glutathione. Proteins made in such systems may be designed to include heparin, thrombin, or factor XA protease cleavage sites so that the cloned protein of interest can be released from the GST moiety at will.

[0182] In the yeast, Saccharomyces cerevisiae, a number of vectors containing constitutive or inducible promoters such as alpha factor, alcohol oxidase, and PGH may be used. For reviews, see Ausubel (supra) and Grant et al. (1987; Methods Enzymol 153:516-544).

[0183] In cases where plant cell expression is desired, the expression of sequences encoding HRM may be driven by any of a number of promoters. For example, viral promoters such as the 35S and 19S promoters of CaMV may be used alone or in combination with the omega leader sequence from TMV (Takamatsu (1987) EMBO J 6:307-311). Alternatively, plant promoters such as the small subunit of RUBISCO or heat shock promoters may be used (Coruzzi et al. (1984) EMBO J 3:1671-1680; Broglie et al. (1984) Science 224:838-843; and Winter et al. (1991) Results Probl Cell Differ 17:85-105). These constructs can be introduced into plant cells by direct DNA transformation or pathogen-mediated transfection. Such techniques are described in a number of generally available reviews (see, for example, Hobbs or Murry, in: McGraw Hill Yearbook of Science and Technology (1992) McGraw Hill, New York N.Y.; pp. 191-196).

[0184] An insect system may also be used to express HRM. For example, in one such system, Autographa californica nuclear polyhedrosis virus (AcNPV) is used as a vector to express foreign genes in Spodoptera frugiperda cells or in Trichoplusia larvae. The sequences encoding HRM may be cloned into a non-essential region of the virus, such as the polyhedrin gene, and placed under control of the polyhedrin promoter. Successful insertion of HRM will render the polyhedrin gene inactive and produce recombinant virus lacking coat protein. The recombinant viruses may then be used to infect, for example, S. frugiperda cells or Trichoplusia larvae in which HRM may be expressed (Engelhard et al. (1994) Proc Natl Acad Sci 91:3224-3227).

[0185] In mammalian host cells, a number of viral-based expression systems may be utilized. In cases where an adenovirus is used as an expression vector, sequences encoding HRM may be ligated into an adenovirus transcription/translation complex consisting of the late promoter and tripartite leader sequence. Insertion in a non-essential E1 or E3 region of the viral genome may be used to obtain a viable virus which is capable of expressing HRM in infected host cells (Logan and Shenk (1984) Proc Natl Acad Sci 81:3655-3659). In addition, transcription enhancers, such as the Rous sarcoma virus (RSV) enhancer, may be used to increase expression in mammalian host cells.

[0186] Human artificial chromosomes (HACs) may also be employed to deliver larger fragments of DNA than can be contained and expressed in a plasmid. HACs of 6 to 10 Mb are constructed and delivered via conventional delivery methods (liposomes, polycationic amino polymers, or vesicles) for therapeutic purposes.

[0187] Specific initiation signals may also be used to achieve more efficient translation of sequences encoding HRM. Such signals include the ATG initiation codon and adjacent sequences. In cases where sequences encoding HRM, its initiation codon, and upstream sequences are inserted into the expression vector, no additional transcriptional or translational control signals may be needed. However, in cases where only coding sequence, or a fragment thereof, is inserted, exogenous translational control signals including the ATG initiation codon should be provided. Furthermore, the initiation codon should be in the correct reading frame to ensure translation of the entire insert. Exogenous translational elements and initiation codons may be of various origins, both natural and synthetic. The efficiency of expression may be enhanced by the inclusion of enhancers which are appropriate for the particular cell system which is used, such as those described in the literature (Scharf et al. (1994) Results Probl Cell Differ 20:125-162).
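
The reading-frame requirement can be illustrated with a simple check. The sketch below assumes a construct given as a plain DNA string, with hypothetical 0-based coordinates for the ATG and for the end of the insert; it reports whether the insert end falls on a codon boundary relative to the ATG and whether any stop codon interrupts the frame before that point. The function name, coordinates, and example sequences are assumptions for illustration only.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def insert_is_in_frame(construct: str, atg_index: int, insert_end: int) -> bool:
    """Return True if translation starting at the ATG reads through the insert.

    atg_index is the 0-based position of the A of the ATG; insert_end is the
    0-based position just past the last base of the insert (exclusive).
    """
    seq = construct.upper()
    if seq[atg_index:atg_index + 3] != "ATG":
        return False  # no initiation codon at the stated position
    if (insert_end - atg_index) % 3 != 0:
        return False  # insert end is not on a codon boundary
    for i in range(atg_index, insert_end, 3):
        if seq[i:i + 3] in STOP_CODONS:
            return False  # premature stop codon within the frame
    return True


# Hypothetical constructs: the second has one extra base after the ATG.
print(insert_is_in_frame("GGATGAAACCCGGGTAA", 2, 14))
print(insert_is_in_frame("GGATGTAAACCCGGG", 2, 14))
```

In the second construct the shifted frame reads TAA immediately after the ATG, so translation would terminate before the insert is read.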

[0188] In addition, a host cell strain may be chosen for its ability to modulate the expression of the inserted sequences or to process the expressed protein in the desired fashion. Such modifications of the protein include, but are not limited to, acetylation, carboxylation, glycosylation, phosphorylation, lipidation, and acylation. Post-translational processing which cleaves a “prepro” form of the protein may also be used to facilitate correct insertion, folding and/or function. Different host cells which have specific cellular machinery and characteristic mechanisms for post-translational activities (e.g., CHO, HeLa, MDCK, HEK293, and WI38) are available from the ATCC (Manassas VA) and may be chosen to ensure the correct modification and processing of the foreign protein.

[0189] For long-term, high-yield production of recombinant proteins, stable expression is preferred. For example, cell lines which stably express HRM may be transformed using expression vectors which may contain viral origins of replication and/or endogenous expression elements and a selectable marker gene on the same or on a separate vector. Following the introduction of the vector, cells may be allowed to grow for 1-2 days in enriched media before they are switched to selective media. The purpose of the selectable marker is to confer resistance to selection, and its presence allows growth and recovery of cells which successfully express the introduced sequences. Resistant clones of stably transformed cells may be proliferated using tissue culture techniques appropriate to the cell type.

[0190] Any number of selection systems may be used to recover transformed cell lines. These include, but are not limited to, the herpes simplex virus thymidine kinase (Wigler et al. (1977) Cell 11:223-32) and adenine phosphoribosyltransferase (Lowy et al. (1980) Cell 22:817-23) genes which can be employed in tk− or aprt− cells, respectively. Also, antimetabolite, antibiotic or herbicide resistance can be used as the basis for selection; for example, dhfr, which confers resistance to methotrexate (Wigler et al. (1980) Proc Natl Acad Sci 77:3567-70); npt, which confers resistance to the aminoglycosides neomycin and G-418 (Colbere-Garapin et al. (1981) J Mol Biol 150:1-14); and als or pat, which confer resistance to chlorsulfuron and phosphinothricin, respectively (Murry, supra). Additional selectable genes have been described, for example, trpB, which allows cells to utilize indole in place of tryptophan, or hisD, which allows cells to utilize histidinol in place of histidine (Hartman and Mulligan (1988) Proc Natl Acad Sci 85:8047-51). Recently, the use of visible markers has gained popularity with such markers as anthocyanins, β-glucuronidase (GUS) and its glucuronide substrate, and luciferase and its substrate luciferin, being widely used not only to identify transformants, but also to quantify the amount of transient or stable protein expression attributable to a specific vector system (Rhodes et al. (1995) Methods Mol Biol 55:121-131).

[0191] Although the presence/absence of marker gene expression suggeststhat the gene of interest is also present, its presence and expressionmay need to be confirmed. For example, if the sequence encoding HRM isinserted within a marker gene sequence, transformed cells containingsequences encoding HRM can be identified by the absence of marker genefunction. Alternatively, a marker gene can be placed in tandem with asequence encoding HRM under the control of a single promoter. Expressionof the marker gene in response to induction or selection usuallyindicates expression of the tandem gene as well.

[0192] Alternatively, host cells which contain the nucleic acid sequence encoding HRM and express HRM may be identified by a variety of procedures known to those of skill in the art. These procedures include, but are not limited to, DNA-DNA or DNA-RNA hybridizations and protein bioassay or immunoassay techniques which include membrane, solution, or chip based technologies for the detection and/or quantification of nucleic acid or protein.

[0193] The presence of polynucleotides encoding HRM can be detected by DNA-DNA or DNA-RNA hybridization or PCR amplification. Nucleic acid amplification based assays involve the use of oligonucleotides based on the polynucleotides encoding HRM to detect transformants containing DNA or RNA encoding HRM.

[0194] A variety of protocols for detecting and measuring the expressionof HRM, using either polyclonal or monoclonal antibodies specific forthe protein are known in the art. Examples include enzyme-linkedimmunosorbent assay (ELISA), radioimmunoassay (RIA), and fluorescenceactivated cell sorting (FACS). A two-site, monoclonal-based immunoassayutilizing monoclonal antibodies reactive to two non-interfering epitopeson HRM is preferred, but a competitive binding assay may be employed.These and other assays are described, among other places, in Hampton etal. (1990; Serological Methods, a Laboratory Manual, APS Press, St PaulMinn.) and Maddox et al. (1983; J Exp Med 158:1211-1216).

[0195] A wide variety of labels and conjugation techniques are known bythose skilled in the art and may be used in various nucleic acid andamino acid assays. Means for producing labeled hybridization or PCRprobes for detecting sequences related to polynucleotides encoding HRMinclude oligolabeling, nick translation, end-labeling or PCRamplification using a labeled nucleotide. Alternatively, the sequencesencoding HRM, or any fragments thereof may be cloned into a vector forthe production of an mRNA probe. Such vectors are known in the art, arecommercially available, and may be used to synthesize RNA probes invitro by addition of an RNA polymerase such as T7, T3, or SP6 andlabeled nucleotides. These procedures may be conducted using a varietyof commercially available kits (APB; Promega, Madison WI).

[0196] Host cells transformed with polynucleotides encoding HRM may be cultured under conditions suitable for the expression and recovery of the protein from cell culture. The protein produced by a transformed cell may be secreted or contained intracellularly depending on the sequence and/or the vector used. As will be understood by those of skill in the art, expression vectors containing polynucleotides which encode HRM may be designed to contain signal sequences which direct secretion of HRM through a prokaryotic or eukaryotic cell membrane. Other constructions may be used to join sequences encoding HRM to a polynucleotide encoding a protein domain which will facilitate purification of soluble proteins. Such purification facilitating domains include, but are not limited to, metal chelating peptides such as histidine-tryptophan modules that allow purification on immobilized metals, protein A domains that allow purification on immobilized immunoglobulin, and the domain utilized in the FLAGS extension/affinity purification system (Immunex, Seattle WA). The inclusion of cleavable linker sequences such as those specific for Factor Xa or enterokinase (Invitrogen, San Diego CA) between the purification domain and HRM may be used to facilitate purification. One such expression vector provides for expression of a fusion protein containing HRM and a nucleic acid encoding 6 histidine residues preceding a thioredoxin or an enterokinase cleavage site. The histidine residues facilitate purification on IMAC (immobilized metal ion affinity chromatography) as described in Porath et al. (1992, Prot Exp Purif 3:263-281) while the enterokinase cleavage site provides a means for purifying HRM from the fusion protein. A discussion of vectors which contain fusion proteins is provided in Kroll et al. (1993; DNA Cell Biol 12:441-453).

[0197] In addition to recombinant production, portions of HRM may beproduced by direct peptide synthesis using solid-phase techniques(Merrifield (1963) J Am Chem Soc 85:2149-2154). Protein synthesis may beperformed using manual techniques or by automation. Automated synthesismay be achieved, for example, using 431A Peptide synthesizer (ABI).Various portions of HRM may be chemically synthesized separately andcombined using chemical methods to produce the full length molecule.

[0198] Therapeutics

[0199] Chemical and structural homology exists among the human regulatory proteins of the invention. The expression of HRM is closely associated with cell proliferation. Therefore, in cancers or immune disorders where HRM is an activator, transcription factor, or enhancer, and is promoting cell proliferation, it is desirable to decrease the expression of HRM. In cancers where HRM is an inhibitor or suppressor and is controlling or decreasing cell proliferation, it is desirable to provide the protein or to increase the expression of HRM.

[0200] In one embodiment, where HRM is an inhibitor, HRM or a portion orderivative thereof may be administered to a subject to treat a cancersuch as adenocarcinoma, leukemia, lymphoma, melanoma, myeloma, sarcoma,and teratocarcinoma. Such cancers include, but are not limited to,cancers of the adrenal gland, bladder, bone, bone marrow, brain, breast,cervix, gall bladder, ganglia, gastrointestinal tract, heart, kidney,liver, lung, muscle, ovary, pancreas, parathyroid, penis, prostate,salivary glands, skin, spleen, testis, thymus, thyroid, and uterus.

[0201] In another embodiment, an agonist which is specific for HRM may be administered to a subject to treat a cancer including, but not limited to, those cancers listed above.

[0202] In a further embodiment, a vector capable of expressing HRM, or a portion or a derivative thereof, may be administered to a subject to treat a cancer including, but not limited to, those cancers listed above.

[0203] In a further embodiment where HRM is promoting cellproliferation, antagonists which decrease the expression or activity ofHRM may be administered to a subject to treat a cancer such asadenocarcinoma, leukemia, lymphoma, melanoma, myeloma, sarcoma, andteratocarcinoma. Such cancers include, but are not limited to, cancersof the adrenal gland, bladder, bone, bone marrow, brain, breast, cervix,gall bladder, ganglia, gastrointestinal tract, heart, kidney, liver,lung, muscle, ovary, pancreas, parathyroid, penis, prostate, salivaryglands, skin, spleen, testis, thymus, thyroid, and uterus. In oneaspect, antibodies which specifically bind HRM may be used directly asan antagonist or indirectly as a targeting or delivery mechanism forbringing a pharmaceutical agent to cells or tissue which express HRM.

[0204] In another embodiment, a vector expressing the complement of the polynucleotide encoding HRM may be administered to a subject to treat a cancer including, but not limited to, those cancers listed above.

[0205] In yet another embodiment where HRM is promoting leukocyte activity or proliferation, antagonists which decrease the activity of HRM may be administered to a subject to treat an immune response. Such responses may be associated with AIDS, Addison's disease, adult respiratory distress syndrome, allergies, anemia, asthma, atherosclerosis, bronchitis, cholecystitis, Crohn's disease, ulcerative colitis, atopic dermatitis, dermatomyositis, diabetes mellitus, emphysema, atrophic gastritis, glomerulonephritis, gout, Graves' disease, hypereosinophilia, irritable bowel syndrome, lupus erythematosus, multiple sclerosis, myasthenia gravis, myocardial or pericardial inflammation, osteoarthritis, osteoporosis, pancreatitis, polymyositis, rheumatoid arthritis, scleroderma, Sjögren's syndrome, and autoimmune thyroiditis, complications of cancer, hemodialysis, extracorporeal circulation, viral, bacterial, fungal, parasitic, protozoal, and helminthic infections, and trauma. In one aspect, antibodies which specifically bind HRM may be used directly as an antagonist or indirectly as a targeting or delivery mechanism for bringing a pharmaceutical agent to cells or tissue which express HRM.

[0206] In another embodiment, a vector expressing the complement of the polynucleotide encoding HRM may be administered to a subject to treat an immune response including, but not limited to, those listed above. In one further embodiment, HRM or a portion or derivative thereof may be added to cells to stimulate cell proliferation. In particular, HRM may be added to a cell in culture or cells in vivo using delivery mechanisms such as liposomes, viral based vectors, or electroinjection for the purpose of promoting cell proliferation and tissue or organ regeneration. Specifically, HRM may be added to a cell, cell line, tissue or organ culture in vitro or ex vivo to stimulate cell proliferation for use in heterologous or autologous transplantation. In some cases, the cell will have been preselected for its ability to fight an infection or a cancer or to correct a genetic defect in a disease such as sickle cell anemia, β thalassemia, cystic fibrosis, or Huntington's chorea.

[0207] In another embodiment, an agonist which is specific for HRM may be administered to a cell to stimulate cell proliferation, as described above.

[0208] In another embodiment, a vector capable of expressing HRM, or a portion or a derivative thereof, may be administered to a cell to stimulate cell proliferation, as described above.

[0209] In other embodiments, any of the therapeutic proteins,antagonists, antibodies, agonists, complementary sequences or vectors ofthe invention may be administered in combination with other therapeuticagents. Selection of the agents for use in combination therapy may bemade by one of ordinary skill in the art, according to conventionalpharmaceutical principles. The combination of therapeutic agents may actsynergistically to effect the treatment of the various disordersdescribed above. Using this approach, one may be able to achievetherapeutic efficacy with lower dosages of each agent, thus reducing thepotential for adverse side effects.

[0210] Antagonists or inhibitors of HRM may be produced using methods which are generally known in the art. In particular, purified HRM may be used to produce antibodies or to screen libraries of pharmaceutical agents to identify those which specifically bind HRM.

[0211] Antibodies to HRM may be generated using methods that are well known in the art. Such antibodies may include, but are not limited to, polyclonal, monoclonal, chimeric, single chain, Fab fragments, and fragments produced by a Fab expression library. Neutralizing antibodies (i.e., those which inhibit dimer formation) are especially preferred for therapeutic use.

[0212] For the production of antibodies, various hosts including goats,rabbits, rats, mice, humans, and others, may be immunized by injectionwith HRM or any portion or oligopeptide thereof which has immunogenicproperties. Depending on the host species, various adjuvants may be usedto increase immunological response. Such adjuvants include, but are notlimited to, Freund's, mineral gels such as aluminum hydroxide, andsurface active substances such as lysolecithin, pluronic polyols,polyanions, peptides, oil emulsions, keyhole limpet hemocyanin, anddinitrophenol. Among adjuvants used in humans, BCG (bacilliCalmette-Guerin) and Corynebacterium parvum are especially preferable.

[0213] It is preferred that the oligopeptides, peptides, or portionsused to induce antibodies to HRM have an amino acid sequence consistingof at least five amino acids and more preferably at least 10 aminoacids. It is also preferable that they are identical to a portion of theamino acid sequence of the natural protein, and they may contain theentire amino acid sequence of a small, naturally occurring molecule.Short stretches of HRM amino acids may be fused with those of anotherprotein such as keyhole limpet hemocyanin and antibody produced againstthe chimeric molecule.

[0214] Monoclonal antibodies to HRM may be prepared using any technique which provides for the production of antibody molecules by continuous cell lines in culture. These include, but are not limited to, the hybridoma technique, the human B-cell hybridoma technique, and the EBV-hybridoma technique (Kohler et al. (1975) Nature 256:495-497; Kozbor et al. (1985) J Immunol Methods 81:31-42; Cote et al. (1983) Proc Natl Acad Sci 80:2026-2030; Cole et al. (1984) Mol Cell Biol 62:109-120).

[0215] In addition, techniques developed for the production of “chimeric antibodies”, the splicing of mouse antibody genes to human antibody genes to obtain a molecule with appropriate antigen specificity and biological activity, can be used (Morrison et al. (1984) Proc Natl Acad Sci 81:6851-6855; Neuberger et al. (1984) Nature 312:604-608; and Takeda et al. (1985) Nature 314:452-454). Alternatively, techniques described for the production of single chain antibodies may be adapted, using methods known in the art, to produce HRM-specific single chain antibodies. Antibodies with related specificity, but of distinct idiotypic composition, may be generated by chain shuffling from random combinatorial immunoglobulin libraries (Burton (1991) Proc Natl Acad Sci 88:11120-3).

[0216] Antibodies may also be produced by inducing in vivo production inthe lymphocyte population or by screening immunoglobulin libraries orpanels of highly specific binding reagents as disclosed in theliterature (Orlandi et al. (1989) Proc Natl Acad Sci 86:3833-3837,Winter et al. (1991) Nature 349:293-299).

[0217] Antibody fragments which contain specific binding sites for HRM may also be generated. For example, such fragments include, but are not limited to, the F(ab′)2 fragments which can be produced by pepsin digestion of the antibody molecule and the Fab fragments which can be generated by reducing the disulfide bridges of the F(ab′)2 fragments. Alternatively, Fab expression libraries may be constructed to allow rapid and easy identification of monoclonal Fab fragments with the desired specificity (Huse et al. (1989) Science 246:1275-1281).

[0218] Various immunoassays may be used for screening to identifyantibodies having the desired specificity. Numerous protocols forcompetitive binding or immunoradiometric assays using either polyclonalor monoclonal antibodies with established specificities are well knownin the art. Such immunoassays typically involve the measurement ofcomplex formation between HRM and its specific antibody. A two-site,monoclonal-based immunoassay utilizing monoclonal antibodies reactive totwo non-interfering HRM epitopes is preferred, but a competitive bindingassay may also be employed (Maddox, supra).

[0219] In another embodiment of the invention, the polynucleotidesencoding HRM, or any fragment or complement thereof, may be used fortherapeutic purposes. In one aspect, the complement of thepolynucleotide encoding HRM may be used in situations in which it wouldbe desirable to block the transcription of the mRNA. In particular,cells may be transformed with sequences complementary to polynucleotidesencoding HRM. Thus, complementary molecules or fragments may be used tomodulate HRM activity, or to achieve regulation of gene function. Suchtechnology is now well known in the art, and sense or antisenseoligonucleotides or larger fragments, can be designed from variouslocations along the coding or control regions of sequences encoding HRM.

[0220] Expression vectors derived from retroviruses, adenovirus, herpes or vaccinia viruses, or from various bacterial plasmids may be used for delivery of polynucleotides to the targeted organ, tissue or cell population. Methods which are well known to those skilled in the art can be used to construct vectors which will express a nucleic acid sequence which is complementary to the polynucleotides of the gene encoding HRM. These techniques are described both in Sambrook (supra) and in Ausubel (supra).

[0221] Genes encoding HRM can be turned off by transforming a cell ortissue with expression vectors which express high levels of apolynucleotide or fragment thereof which encodes HRM. Such constructsmay be used to introduce untranslatable sense or antisense sequencesinto a cell. Even in the absence of integration into the DNA, suchvectors may continue to transcribe RNA molecules until they are disabledby endogenous nucleases. Transient expression may last for a month ormore with a non-replicating vector and even longer if replicationelements are part of the vector system.

[0222] As mentioned above, modifications of gene expression can be obtained by designing complementary sequences or antisense molecules (DNA, RNA, or PNA) to the control, 5′ or regulatory regions of the gene encoding HRM (signal sequence, promoters, enhancers, and introns). Oligonucleotides derived from the transcription initiation site, e.g., between positions −10 and +10 from the start site, are preferred. Similarly, inhibition can be achieved using “triple helix” base-pairing methodology. Triple helix pairing is useful because it causes inhibition of the ability of the double helix to open for the binding of polymerases, transcription factors, or regulatory molecules. Recent therapeutic advances using triplex DNA have been described in the literature (Gee et al. (1994) In: Huber and Carr, Molecular and Immunologic Approaches, Futura Publishing, Mt. Kisco N.Y.). The complementary sequence or antisense molecule may also be designed to block translation of mRNA by preventing the transcript from binding to ribosomes.
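As a minimal sketch of deriving an oligonucleotide from the region between positions −10 and +10, the following takes the sense strand around a known start site and returns its reverse complement. The function name `antisense_oligo`, the 0-based `tss_index` argument, and the toy sequence are assumptions for illustration; the sketch follows the usual convention that there is no position 0, so the window spans 20 bases.

```python
# Hypothetical helper: the name, arguments and toy sequence are illustrative assumptions.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def antisense_oligo(sense: str, tss_index: int,
                    upstream: int = 10, downstream: int = 10) -> str:
    """Return the antisense DNA oligo covering -upstream..+downstream.

    sense     -- sense-strand DNA, 5' to 3'
    tss_index -- 0-based index of position +1 (the first transcribed base)
    """
    start = tss_index - upstream
    end = tss_index + downstream
    if start < 0 or end > len(sense):
        raise ValueError("window extends beyond the supplied sequence")
    window = sense[start:end].upper()
    # Complement each base, then reverse so the oligo reads 5' to 3'.
    return window.translate(COMPLEMENT)[::-1]


toy_sense = "A" * 10 + "C" * 10  # positions -10..-1 are A, +1..+10 are C
oligo = antisense_oligo(toy_sense, tss_index=10)
```

For the toy input, `oligo` is ten G residues followed by ten T residues. Candidate oligonucleotides would still need to be assessed experimentally for specificity and activity.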

[0223] Ribozymes, enzymatic RNA molecules, may also be used to catalyzethe specific cleavage of RNA. The mechanism of ribozyme action involvessequence-specific hybridization of the ribozyme molecule tocomplementary target RNA, followed by endonucleolytic cleavage. Exampleswhich may be used include engineered hammerhead motif ribozyme moleculesthat can specifically and efficiently catalyze endonucleolytic cleavageof sequences encoding HRM.

[0224] Specific ribozyme cleavage sites within any potential RNA target are initially identified by scanning the target molecule for ribozyme cleavage sites which include the following sequences: GUA, GUU, and GUC. Once identified, short RNA sequences of between 15 and 20 ribonucleotides corresponding to the region of the target gene containing the cleavage site may be evaluated for secondary structural features which may render the oligonucleotide inoperable. The suitability of candidate targets may also be evaluated by testing accessibility to hybridization with complementary oligonucleotides using ribonuclease protection assays.
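The scanning step described above can be sketched as follows. The function name `find_cleavage_sites`, the choice to center the window on the triplet, and the toy target are assumptions for illustration; secondary-structure evaluation and ribonuclease protection testing are outside the scope of the sketch.

```python
import re

# Hypothetical helper: the name, window placement and toy target are assumptions.


def find_cleavage_sites(rna: str, window: int = 20) -> list:
    """List (position, triplet, surrounding window) for each GUA, GUU or GUC.

    rna    -- target sequence; DNA-style T is converted to U
    window -- length of the region reported around each site (15 to 20)
    """
    if not 15 <= window <= 20:
        raise ValueError("window should be between 15 and 20 ribonucleotides")
    rna = rna.upper().replace("T", "U")
    hits = []
    for match in re.finditer(r"GU[AUC]", rna):
        pos = match.start()
        center = pos + 1  # middle base of the triplet
        start = max(0, center - window // 2)
        end = min(len(rna), start + window)
        start = max(0, end - window)  # re-anchor if the site is near the 3' end
        hits.append((pos, match.group(0), rna[start:end]))
    return hits


toy_target = "A" * 10 + "GUC" + "A" * 10 + "GUG" + "A" * 4
sites = find_cleavage_sites(toy_target)
```

In the toy target only the GUC triplet is reported; GUG is not among the listed cleavage sequences and is skipped.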

[0225] Complementary ribonucleic acid molecules and ribozymes of theinvention may be prepared by any method known in the art for thesynthesis of nucleic acid molecules. These include techniques forchemically synthesizing oligonucleotides such as solid phasephosphoramidite chemical synthesis. Alternatively, RNA molecules may begenerated by in vitro and in vivo transcription of DNA sequencesencoding HRM. Such DNA sequences may be incorporated into a wide varietyof vectors with RNA polymerase promoters such as T7 or SP6.Alternatively, these cDNA constructs that synthesize complementary RNAconstitutively or inducibly can be introduced into cell lines, cells, ortissues.

[0226] RNA molecules may be modified to increase intracellular stability and half-life. Possible modifications include, but are not limited to, the addition of flanking sequences at the 5′ and/or 3′ ends of the molecule or the use of phosphorothioate or 2′ O-methyl rather than phosphodiester linkages within the backbone of the molecule. This concept is inherent in the production of PNAs and can be extended in all of these molecules by the inclusion of nontraditional bases such as inosine, queuosine, and wybutosine, as well as acetyl-, methyl-, thio-, and similarly modified forms of adenine, cytidine, guanine, thymine, and uridine which are not as easily recognized by endogenous endonucleases.

[0227] Many methods for introducing vectors into cells or tissues areavailable for use in vivo, in vitro, and ex vivo. For ex vivo therapy,vectors may be introduced into stem cells taken from the patient andclonally propagated for autologous transplant back into that samepatient. Delivery by transfection, by liposome injections orpolycationic amino polymers (Goldman et al. (1997) Nature Biotechnol15:462-66, incorporated herein by reference) may be achieved usingmethods which are well known in the art.

[0228] Any of the therapeutic methods described above may be applied to any subject in need of such therapy, including, for example, mammals such as dogs, cats, cows, horses, rabbits, monkeys, and most preferably, humans.

[0229] An additional embodiment of the invention relates to theadministration of a pharmaceutical composition, in conjunction with apharmaceutically acceptable carrier, for any of the therapeutic effectsdiscussed above. Such pharmaceutical compositions may consist of HRM,antibodies to HRM, mimetics, agonists, antagonists, or inhibitors ofHRM. The compositions may be administered alone or in combination withat least one other agent, such as stabilizing compound, which may beadministered in any sterile, biocompatible pharmaceutical carrier,including, but not limited to, saline, buffered saline, dextrose, andwater. The compositions may be administered to a patient alone, or incombination with other agents, drugs or hormones.

[0230] The pharmaceutical compositions utilized in this invention may be administered by any number of routes including, but not limited to, oral, intravenous, intramuscular, intra-arterial, intramedullary, intrathecal, intraventricular, transdermal, subcutaneous, intraperitoneal, intranasal, enteral, topical, sublingual, or rectal means.

[0231] In addition to the active ingredients, these pharmaceuticalcompositions may contain pharmaceutically-acceptable carriers comprisingexcipients and auxiliaries which facilitate processing of the activecompounds into preparations which can be used pharmaceutically. Furtherdetails on techniques for formulation and administration may be found inthe latest edition of Remington's Pharmaceutical Sciences (MackPublishing, Easton PA).

[0232] Pharmaceutical compositions for oral administration can beformulated using pharmaceutically acceptable carriers well known in theart in dosages for oral administration. Such carriers enable thepharmaceutical compositions to be formulated as tablets, pills, dragees,capsules, liquids, gels, syrups, slurries, suspensions, and the like,for ingestion by the patient.

[0233] Pharmaceutical preparations for oral use can be obtained throughcombination of active compounds with solid excipient, optionallygrinding a resulting mixture, and processing the mixture of granules,after adding auxiliaries, if desired, to obtain tablets or dragee cores.Excipients include carbohydrate or protein fillers, such as sugars,including lactose, sucrose, mannitol, or sorbitol, starch from corn,wheat, rice, potato, or other plants, cellulose, such as methylcellulose, hydroxypropylmethyl-cellulose, or sodiumcarboxymethylcellulose, gums including arabic and tragacanth, andproteins such as gelatin and collagen. If desired, disintegrating orsolubilizing agents may be added, such as the cross-linked polyvinylpyrrolidone, agar, alginic acid, or a salt thereof, such as sodiumalginate.

[0234] Dragee cores may be used in conjunction with coatings, such asconcentrated sugar solutions, which may also contain gum arabic, talc,polyvinylpyrrolidone, carbopol gel, polyethylene glycol, and/or titaniumdioxide, lacquer solutions, and organic solvents or solvent mixtures.Dyestuffs or pigments may be added to the tablets or dragee coatings forproduct identification or to characterize the quantity of activecompound, i.e., dosage.

[0235] Pharmaceutical preparations which can be used orally includepush-fit capsules made of gelatin, as well as soft, sealed capsules madeof gelatin and a coating, such as glycerol or sorbitol. Push-fitcapsules can contain active ingredients mixed with a filler or binders,such as lactose or starches, lubricants, such as talc or magnesiumstearate, and, optionally, stabilizers. In soft capsules, the activecompounds may be dissolved or suspended in liquids, such as fatty oils,liquid, or liquid polyethylene glycol with or without stabilizers.

[0236] Pharmaceutical formulations for parenteral administration may be formulated in aqueous solutions, preferably in physiologically compatible buffers such as Hanks' solution, Ringer's solution, or physiologically buffered saline. Aqueous injection suspensions may contain substances which increase the viscosity of the suspension, such as sodium carboxymethyl cellulose, sorbitol, or dextran. Additionally, suspensions of the active compounds may be prepared as oily injection suspensions. Lipophilic solvents or vehicles include fatty oils such as sesame oil, or synthetic fatty acid esters, such as ethyl oleate or triglycerides, or liposomes. Non-lipid polycationic amino polymers may also be used for delivery. Optionally, the suspension may also contain stabilizers or agents which increase the solubility of the compounds to allow for the preparation of highly concentrated solutions.

[0237] For topical or nasal administration, penetrants appropriate to the particular barrier to be permeated are used in the formulation. Such penetrants are generally known in the art.

[0238] The pharmaceutical compositions of the present invention may bemanufactured in a manner that is known in the art, e.g., by means ofconventional mixing, dissolving, granulating, dragee-making, levigating,emulsifying, encapsulating, entrapping, or lyophilizing processes.

[0239] The pharmaceutical composition may be provided as a salt and can be formed with many acids, including but not limited to, hydrochloric, sulfuric, acetic, lactic, tartaric, malic, succinic, etc. Salts tend to be more soluble in aqueous or other protonic solvents than are the corresponding free base forms. In other cases, the preferred preparation may be a lyophilized powder which may contain any or all of the following: 1-50 mM histidine, 0.1%-2% sucrose, and 2-7% mannitol, at a pH range of 4.5 to 5.5, that is combined with buffer prior to use.

[0240] After pharmaceutical compositions have been prepared, they can be placed in a container and labeled for treatment of an indicated condition. For administration of HRM, such labeling would include amount, frequency, and method of administration.

[0241] Pharmaceutical compositions for use in the invention include compositions wherein the active ingredients are contained in an effective amount to achieve the intended purpose. The determination of an effective dose is well within the capability of those skilled in the art.

[0242] For any compound, the therapeutically effective dose can beestimated initially either in cell culture assays, e.g., of neoplasticcells, or in animal models, usually mice, rabbits, dogs, or pigs. Theanimal model may also be used to determine the concentration range androute of administration. Such information can then be used to determineuseful doses and routes for administration in humans.

[0243] A therapeutically effective dose refers to that amount of active ingredient, for example HRM or portions thereof, antibodies to HRM, agonists, antagonists or inhibitors of HRM, which ameliorates the symptoms or condition. Therapeutic efficacy and toxicity may be determined by standard pharmaceutical procedures in cell cultures or experimental animals, e.g., ED50 (the dose therapeutically effective in 50% of the population) and LD50 (the dose lethal to 50% of the population). The dose ratio between therapeutic and toxic effects is the therapeutic index, and it can be expressed as the ratio, LD50/ED50.
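The ratio LD50/ED50 can be illustrated with a short sketch. The interpolation on a log10 dose scale, the function names, and the dose-response values below are assumptions chosen for illustration only; they are not measured data and do not represent a prescribed estimation procedure.

```python
import math

# Hypothetical helpers: names, interpolation method and toy values are assumptions.


def dose_at_half_response(doses, fraction_responding) -> float:
    """Interpolate, on a log10 dose scale, the dose at which 50% respond."""
    pairs = sorted(zip(doses, fraction_responding))
    for (d0, f0), (d1, f1) in zip(pairs, pairs[1:]):
        if f0 <= 0.5 <= f1 and f1 > f0:
            t = (0.5 - f0) / (f1 - f0)
            log_dose = math.log10(d0) + t * (math.log10(d1) - math.log10(d0))
            return 10 ** log_dose
    raise ValueError("response does not cross 50% in the tested dose range")


def therapeutic_index(ld50: float, ed50: float) -> float:
    """Therapeutic index expressed as the ratio LD50/ED50."""
    return ld50 / ed50


# Toy dose-response tables (arbitrary dose units, invented values).
ed50 = dose_at_half_response([1, 10, 100], [0.0, 0.5, 1.0])
ld50 = dose_at_half_response([10, 100, 1000], [0.0, 0.5, 1.0])
index = therapeutic_index(ld50, ed50)
```

With these invented tables the index is about 10; a larger ratio corresponds to a wider margin between effective and lethal doses.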

[0244] Pharmaceutical compositions which exhibit large therapeuticindices are preferred. The data obtained from cell culture assays andanimal studies is used in formulating a range of dosage for human use.The dosage contained in such compositions is preferably within a rangeof circulating concentrations that include the ED50 with little or notoxicity. The dosage varies within this range depending upon the dosageform employed, sensitivity of the patient, and the route ofadministration.

[0245] The exact dosage will be determined by the practitioner, in lightof factors related to the subject that requires treatment. Dosage andadministration are adjusted to provide levels of the active moiety thatproduce or maintain the desired effect. Factors which may be taken intoaccount include the severity of the disease state, general health of thesubject, age, weight, and gender of the subject, diet, time andfrequency of administration, drug combination(s), reactionsensitivities, and tolerance/response to therapy. Long-actingpharmaceutical compositions may be administered every 3 to 4 days, everyweek, or once every two weeks depending on half-life and clearance rateof the particular formulation.

[0246] Normal dosage amounts may vary from 0.1 to 100,000 micrograms, upto a total dose of about 1 g, depending upon the route ofadministration. Guidance as to particular dosages and methods ofdelivery is provided in the literature and generally available topractitioners in the art. Those skilled in the art will employ differentformulations for nucleotides than for proteins or their inhibitors.Similarly, delivery of polynucleotides or proteins will be specific toparticular cells, conditions, locations, etc.

[0247] Diagnostics

[0248] In another embodiment, antibodies which specifically bind HRM may be used for the diagnosis of conditions or diseases characterized by expression of HRM, or in assays to monitor patients being treated with HRM, agonists, antagonists or inhibitors. The antibodies useful for diagnostic purposes may be prepared in the same manner as those described above for therapeutics. Diagnostic assays for HRM include methods which utilize the antibody and a label to detect HRM in human body fluids or extracts of cells or tissues. The antibodies may be used with or without modification, and may be labeled by joining them, either covalently or non-covalently, with a reporter molecule. A wide variety of reporter molecules which are known in the art may be used, several of which are described above.

[0249] A variety of protocols including ELISA, RIA, and FACS for measuring HRM are known in the art and provide a basis for diagnosing altered or abnormal levels of HRM expression. Normal or standard values for HRM expression are established by combining body fluids or cell extracts taken from normal mammalian subjects, preferably human, with antibody to HRM under conditions for complex formation. The amount of standard complex formation may be quantified by various methods, but preferably by photometric means. Quantities of HRM expressed in subject samples are compared with the standard values from control and diseased samples. Deviation between standard and subject values establishes the parameters for diagnosing disease.
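
As a rough illustration of comparing a subject value with standard values, the sketch below expresses the deviation in units of the standard deviation of the normal samples. The choice of this statistic, the function name, and the example readings are assumptions made for illustration; the text does not prescribe a particular statistic or cutoff.

```python
from statistics import mean, stdev


def deviation_from_standard(normal_values, subject_value):
    """Return how many sample standard deviations the subject value
    lies from the mean of the normal (standard) values."""
    mu = mean(normal_values)
    sigma = stdev(normal_values)
    if sigma == 0:
        raise ValueError("normal values show no spread")
    return (subject_value - mu) / sigma


# Hypothetical photometric readings from normal subjects (arbitrary units).
normal = [10.0, 12.0, 11.0, 9.0, 13.0]
print(deviation_from_standard(normal, 16.0))
```

What size of deviation counts as diagnostic would have to be set from control and diseased samples, as the paragraph above describes.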

[0250] In another embodiment of the invention, the polynucleotides encoding HRM may be used for diagnostic purposes. The polynucleotides which may be used include oligonucleotide sequences, complementary RNA and DNA molecules, and PNAs. The polynucleotides may be used to detect and quantitate gene expression in biopsied tissues in which expression of HRM may be correlated with disease. The diagnostic assay may be used to distinguish between absence, presence, and excess expression of HRM, and to monitor regulation of HRM levels during therapeutic intervention.

[0251] In one aspect, hybridization with PCR probes which are capable of detecting polynucleotides, including genomic sequences, encoding HRM or closely related molecules, may be used to identify nucleic acid sequences which encode HRM. The specificity of the probe, whether it is made from a highly specific region, e.g., 10 unique nucleotides in the 5′ regulatory region, or a less specific region, e.g., especially in the 3′ coding region, and the stringency of the hybridization or amplification (maximal, high, intermediate, or low) will determine whether the probe identifies only naturally occurring sequences encoding HRM, alleles, or related sequences.

[0252] Probes may also be used for the detection of related sequences, and should preferably contain at least 50% of the nucleotides from any of the HRM encoding sequences. The hybridization probes of the invention may be DNA or RNA and derived from the polynucleotide of SEQ ID NOs:50-98 or from genomic sequence including promoter, enhancer elements, and introns of the naturally occurring HRM.

[0253] Means for producing specific hybridization probes for polynucleotides encoding HRM include the cloning of nucleic acid sequences encoding HRM or HRM derivatives into vectors for the production of mRNA probes. Such vectors are known in the art, commercially available, and may be used to synthesize RNA probes in vitro by means of the addition of the RNA polymerases and the labeled nucleotides. Hybridization probes may be labeled by a variety of reporter groups, for example, radionuclides such as 32P or 35S, or enzymatic labels, such as alkaline phosphatase coupled to the probe via avidin/biotin coupling systems, and the like.

[0254] Polynucleotides encoding HRM may be used for the diagnosis of conditions, disorders, or diseases which are associated with either increased or decreased expression of HRM. Examples of such conditions or diseases include adenocarcinoma, leukemia, lymphoma, melanoma, myeloma, sarcoma, teratocarcinoma, and cancers of the adrenal gland, bladder, bone, brain, breast, cervix, gall bladder, ganglia, gastrointestinal tract, heart, kidney, liver, lung, bone marrow, muscle, ovary, pancreas, parathyroid, penis, prostate, salivary glands, skin, spleen, testis, thymus, thyroid, and uterus, and immune disorders such as AIDS, Addison's disease, adult respiratory distress syndrome, allergies, anemia, asthma, atherosclerosis, bronchitis, cholecystitis, Crohn's disease, ulcerative colitis, atopic dermatitis, dermatomyositis, diabetes mellitus, emphysema, atrophic gastritis, glomerulonephritis, gout, Graves' disease, hypereosinophilia, irritable bowel syndrome, lupus erythematosus, multiple sclerosis, myasthenia gravis, myocardial or pericardial inflammation, osteoarthritis, osteoporosis, pancreatitis, polymyositis, rheumatoid arthritis, scleroderma, Sjögren's syndrome, and thyroiditis. The polynucleotides encoding HRM may be used in Southern or northern analysis, dot blot, or other membrane-based technologies, in PCR technologies, or in dipstick, pin, or other multiformat assays including microarrays to analyze fluids or tissues from patient biopsies to detect altered HRM expression. Such qualitative or quantitative methods are well known in the art.

[0255] In a particular aspect, the polynucleotides encoding HRM may be useful in assays that detect activation or induction of various cancers, particularly those mentioned above. The polynucleotides encoding HRM may be labeled by standard methods, and added to a fluid or tissue sample from a patient under conditions for the formation of hybridization complexes. After an incubation period, the sample is washed, the signal is quantitated and compared with a standard value. If the amount of signal in the biopsied or extracted sample is significantly different from that of a comparable control sample, the polynucleotides have hybridized with nucleic acids in the sample, and the presence of differentially expressed polynucleotides encoding HRM in the sample indicates the presence of the disease. Such assays may also be used to evaluate the efficacy of a particular therapeutic treatment regimen in animal studies, in clinical trials, or in monitoring the treatment of an individual patient.

[0256] In order to provide a basis for the diagnosis of disease associated with expression of HRM, a normal or standard profile for expression is established. This may be accomplished by combining body fluids or cell extracts taken from normal subjects, either animal or human, with a sequence, or a fragment thereof, which encodes HRM, under conditions for hybridization or amplification. Standard hybridization may be quantified by comparing the values obtained from normal subjects with those from an experiment where a known amount of an isolated polynucleotide is used. Standard values obtained from normal samples may be compared with values obtained from samples from patients who are symptomatic for disease. Deviation between standard and subject values is used to establish the presence of disease.

[0257] Once disease is established and a treatment protocol is initiated, hybridization assays may be repeated on a regular basis to evaluate whether the level of expression in the patient begins to approximate that which is observed in the normal patient. The results obtained from successive assays may be used to show the efficacy of treatment over a period ranging from several days to months.

[0258] With respect to cancer, the presence of a relatively high amount of transcript in biopsied tissue from an individual may indicate a predisposition for the development of the disease or may provide a means for detecting the disease prior to the appearance of actual clinical symptoms. A more definitive diagnosis of this type may allow health professionals to employ aggressive treatment earlier, thereby preventing further progression of the cancer.

[0259] Additional diagnostic uses for oligonucleotides designed from the sequences encoding HRM may involve the use of PCR. Such oligomers may be chemically synthesized, generated enzymatically, or produced in vitro. Oligomers will preferably consist of two polynucleotides, one with sense orientation (5′→3′) and another with antisense (3′←5′), employed under optimized conditions for identification of a specific gene or condition. The same two oligomers, nested sets of oligomers, or even a degenerate pool of oligomers may be employed under less stringent conditions for detection and/or quantitation of closely related DNA or RNA sequences.

[0260] Methods which may also be used to quantitate the expression of HRM include radiolabeling or biotinylating nucleotides, coamplification of a control nucleic acid, and standard curves onto which the experimental results are interpolated (Melby et al. (1993) J Immunol Methods 159:235-244; Duplaa et al. (1993) Anal Biochem 229-236). The speed of quantitation of multiple samples may be accelerated by running the assay in a multiwell format where the oligomer of interest is presented in various dilutions and a spectrophotometric or colorimetric response gives rapid quantitation.
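
Interpolating an experimental result onto a standard curve can be sketched as follows. The piecewise-linear form of the curve, the function name, and the standard points are assumptions for illustration only; the cited methods may fit the curve differently.

```python
from bisect import bisect_left


def interpolate_amount(standards, signal):
    """Linearly interpolate an amount from (amount, signal) standard points.

    Assumes the signal rises strictly with the amount across the standards.
    """
    pts = sorted(standards, key=lambda p: p[1])
    signals = [s for _, s in pts]
    if not signals[0] <= signal <= signals[-1]:
        raise ValueError("signal lies outside the standard curve")
    i = bisect_left(signals, signal)
    if signals[i] == signal:
        return pts[i][0]
    (a0, s0), (a1, s1) = pts[i - 1], pts[i]
    return a0 + (a1 - a0) * (signal - s0) / (s1 - s0)


# Hypothetical dilution series: (amount, absorbance).
curve = [(0, 0.0), (10, 0.5), (20, 1.0), (40, 2.0)]
print(interpolate_amount(curve, 0.75))  # 15.0
```

Readings outside the range of the standards are rejected rather than extrapolated, which is a deliberate choice in this sketch.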

[0261] In further embodiments, oligonucleotides or longer fragments derived from any of the polynucleotides may be used as targets on a microarray. The microarray can be used to monitor the expression level of large numbers of genes simultaneously (to produce a transcript image), and to identify genetic variants, mutations and polymorphisms. This information may be used in determining gene function, in understanding the genetic basis of disease, in diagnosing disease, and in developing and monitoring the activities of therapeutic agents.

[0262] In one embodiment, the microarray is prepared and used according to the methods described in PCT application WO95/11995, Lockhart et al. (1996, Nature Biotechnol 14:1675-1680) and Schena et al. (1996, Proc Natl Acad Sci 93:10614-10619), all of which are incorporated herein in their entirety by reference.

[0263] The microarray is preferably composed of a large number of unique, single-stranded nucleic acid sequences, usually either synthetic antisense oligonucleotides or fragments of cDNAs, fixed to a solid support. The oligonucleotides are preferably about 6-60 nucleotides in length, more preferably 15-30 nucleotides in length, and most preferably about 20-25 nucleotides in length. For a certain type of microarray, it may be preferable to use oligonucleotides which are only 7-10 nucleotides in length. The microarray may contain oligonucleotides which cover the known 5′, or 3′, sequence, or contain sequential oligonucleotides which cover the full length sequence, or unique oligonucleotides selected from particular areas along the length of the sequence. Polynucleotides used in the microarray may be oligonucleotides that are specific to a gene or genes of interest in which at least a fragment of the sequence is known or that are specific to one or more unidentified cDNAs which are common to a particular cell or tissue type or to a normal, developmental, or disease state. In certain situations it may be appropriate to use pairs of oligonucleotides on a microarray. The “pairs” will be identical, except for one nucleotide which is located in the center of the sequence. The second oligonucleotide in the pair (mismatched by one) serves as a control. The number of oligonucleotide pairs may range from 2 to one million.
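
The paired design described above, in which the control differs from its partner by a single central nucleotide, can be sketched as below. Substituting the complementary base is an assumption made for illustration; the text says only that the central nucleotide differs. The function name and example sequence are hypothetical.

```python
def center_mismatch(oligo: str) -> str:
    """Return a copy of the oligo whose central base is replaced.

    The complement is used as the replacement base (an illustrative choice).
    """
    swap = {"A": "T", "T": "A", "G": "C", "C": "G"}
    mid = len(oligo) // 2
    return oligo[:mid] + swap[oligo[mid]] + oligo[mid + 1:]


perfect_match = "ACGTACGTAC"          # hypothetical 10-mer
control = center_mismatch(perfect_match)
print(perfect_match, control)         # ACGTACGTAC ACGTAGGTAC
```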

[0264] In order to produce oligonucleotides to a known sequence for a microarray, the gene of interest is examined using a computer algorithm which starts at the 5′ or more preferably at the 3′ end of the polynucleotide. The algorithm identifies oligomers of defined length that are unique to the gene, have a GC content within a range for hybridization, and lack predicted secondary structure that may interfere with hybridization. In one aspect, the oligomers are synthesized at designated areas on a substrate using a light-directed chemical process. The substrate may be paper, nylon or other type of membrane, filter, chip, glass slide, or any other solid support.
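
A minimal sketch of such a selection pass is given below. It is not the algorithm used by the applicants: the 40-60% GC window, the crude hairpin test (a 4-base stem separated by at least 3 bases), the exact-substring uniqueness test, and all names are assumptions chosen to illustrate the three criteria named in the paragraph.

```python
def revcomp(seq):
    """Reverse complement of a DNA string (A, C, G, T only)."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def gc_fraction(seq):
    return (seq.count("G") + seq.count("C")) / len(seq)


def has_hairpin(oligo, stem=4, loop=3):
    """Crude secondary-structure test: does any stem-length segment have its
    reverse complement further along the oligo, past a minimal loop?"""
    for i in range(len(oligo) - 2 * stem - loop + 1):
        if revcomp(oligo[i:i + stem]) in oligo[i + stem + loop:]:
            return True
    return False


def select_oligos(gene, others, length=20, gc_range=(0.4, 0.6), max_oligos=20):
    """Walk from the 3' end toward the 5' end and keep windows that pass
    the GC, uniqueness, and hairpin filters."""
    chosen = []
    for start in range(len(gene) - length, -1, -1):
        oligo = gene[start:start + length]
        if not gc_range[0] <= gc_fraction(oligo) <= gc_range[1]:
            continue
        if any(oligo in other for other in others):  # not unique to this gene
            continue
        if has_hairpin(oligo):
            continue
        chosen.append((start, oligo))
        if len(chosen) == max_oligos:
            break
    return chosen
```

Calling `select_oligos(gene, others=[...])` returns `(start, oligo)` pairs ordered from the 3′ end; overlapping windows are allowed in this sketch, and a production tool would likely space the picks and use a thermodynamic structure prediction instead of the simple stem search.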

[0265] In another aspect, the oligonucleotides may be synthesized on the surface of the substrate by using a chemical coupling procedure and an ink jet application apparatus, as described in PCT application WO95/251116 (Baldeschweiler et al.) which is incorporated herein in its entirety by reference. In another aspect, a gridded array analogous to a dot or slot blot apparatus may be used to arrange and link cDNA fragments or oligonucleotides to the surface of a substrate using a vacuum system, thermal, UV, mechanical or chemical bonding procedures. In yet another aspect, an array may be produced by hand or using available devices, materials, and machines (including multichannel pipetters or robotic instruments) and may contain 8, 24, 96, 384, 1536 or 6144 oligonucleotides, or any other multiple from 2 to one million which lends itself to the efficient use of commercially available instrumentation.

[0266] In order to conduct sample analysis using the microarrays, polynucleotides are extracted from a biological sample. The biological samples may be obtained from any bodily fluid (blood, urine, saliva, phlegm, gastric juices, etc.), cultured cells, biopsies, or other tissue preparations. To produce probes, the polynucleotides extracted from the sample are used to produce nucleic acid sequences which are complementary to the nucleic acids on the microarray. If the microarray consists of cDNAs, antisense RNAs (aRNA) are probes. Therefore, in one aspect, mRNA is used to produce cDNA which, in turn and in the presence of fluorescent nucleotides, is used to produce fragment or oligonucleotide aRNA probes. These fluorescently labeled probes are incubated with the microarray so that the probe sequences hybridize to the cDNA oligonucleotides of the microarray. In another aspect, complementary nucleic acid sequences are used as probes and can also include polynucleotides, fragments, complementary, or antisense sequences produced using restriction enzymes, PCR technologies, and oligolabeling kits which are well known in the art.

[0267] Incubation conditions are adjusted so that hybridization occurs with precise complementary matches or with various degrees of less complementarity. After removal of nonhybridized probes, a scanner is used to determine the levels and patterns of fluorescence. The scanned images are examined to determine degree of complementarity and the relative abundance of each oligonucleotide sequence on the microarray. A detection system may be used to measure the absence, presence, and amount of hybridization for all of the distinct sequences simultaneously. This data may be used for large scale correlation studies or functional analysis of the sequences, mutations, variants, or polymorphisms among samples (Heller et al. (1997) Proc Natl Acad Sci 94:2150-55).

[0268] In another embodiment of the invention, the nucleic acid sequences which encode HRM may also be used to generate hybridization probes which are useful for mapping the naturally occurring genomic sequence. The sequences may be mapped to a particular chromosome, to a specific region of a chromosome or to artificial chromosome constructions, such as human artificial chromosomes (HACs), yeast artificial chromosomes (YACs), bacterial artificial chromosomes (BACs), bacterial P1 constructions, or single chromosome cDNA libraries as reviewed in Price (1993, Blood Rev 7:127-134) and Trask (1991, Trends Genet 7:149-154).

[0269] Fluorescent in situ hybridization (FISH, as described in Verma et al. (1988) Human Chromosomes: A Manual of Basic Techniques, Pergamon Press, New York N.Y.) may be correlated with other physical chromosome mapping techniques and genetic map data. Examples of genetic map data can be found in various scientific journals or at Online Mendelian Inheritance in Man (OMIM). Correlation between the location of the gene encoding HRM on a physical chromosomal map and a specific disease, or predisposition to a specific disease, may help delimit the region of DNA associated with that genetic disease. The polynucleotides of the invention may be used to detect differences in gene sequences between normal, carrier, or affected individuals.

[0270] In situ hybridization of chromosomal preparations and physical mapping techniques such as linkage analysis using established chromosomal markers may be used for extending genetic maps. Often the placement of a gene on the chromosome of another mammalian species, such as mouse, may reveal associated markers even if the number or arm of a particular human chromosome is not known. New sequences can be assigned to chromosomal arms, or parts thereof, by physical mapping. This provides valuable information to investigators searching for disease genes using positional cloning or other gene discovery techniques. Once the disease or syndrome has been crudely localized by genetic linkage to a particular genomic region, for example, AT to 11q22-23 (Gatti et al. (1988) Nature 336:577-580), any sequences mapping to that area may represent associated or regulatory genes for further investigation. The polynucleotide of the invention may also be used to detect differences in the chromosomal location due to translocation, inversion, etc. among normal, carrier, or affected individuals.

[0271] In another embodiment of the invention, HRM, its catalytic or immunogenic portions or oligopeptides thereof, can be used for screening libraries of compounds in any of a variety of drug screening techniques. The portion employed in such screening may be free in solution, affixed to a solid support, borne on a cell surface, or located intracellularly. The formation of binding complexes, between HRM and the agent being tested, may be measured.

[0272] Another technique for drug screening which may be used provides for high throughput screening of compounds having binding affinity to the protein of interest as described in published PCT application WO84/03564. In this method, large numbers of different small test compounds are synthesized on a solid substrate, such as plastic pins or some other surface. The test compounds are reacted with HRM, or portions thereof, and washed. Bound HRM is then detected by methods well known in the art. Purified HRM can also be coated directly onto plates for use in the aforementioned drug screening techniques. Alternatively, non-neutralizing antibodies can be used to capture the peptide and immobilize it on a solid support.

[0273] In another embodiment, one may use competitive drug screening assays in which neutralizing antibodies capable of binding HRM specifically compete with a test compound for binding HRM. In this manner, the antibodies can be used to detect the presence of any peptide which shares one or more antigenic determinants with HRM.

[0274] In additional embodiments, the polynucleotides which encode HRM may be used in any molecular biology techniques that have yet to be developed, provided the new techniques rely on properties of polynucleotides that are currently known, including, but not limited to, such properties as the triplet genetic code and specific base pair interactions.

[0275] The examples below are provided to illustrate the invention and are not included for the purpose of limiting the invention.

EXAMPLES

[0276] For purposes of example, the preparation and sequencing of the LNODNOT03 cDNA library, from which Incyte Clones 1572888, 1573677, 1574624, and 1577239 were isolated, is described. Preparation and sequencing of cDNAs in libraries in the LIFESEQ database (Incyte Genomics, Palo Alto CA) have varied over time, and the gradual changes involved use of kits, plasmids, and machinery available at the particular time the library was made and analyzed.

I LNODNOT03 cDNA Library Construction

[0277] The LNODNOT03 cDNA library was constructed using 1 μg of polyA RNA isolated from lymph node tissue removed from a 67-year-old Caucasian male during a segmental lung resection and bronchoscopy. Microscopic examination showed that the tissue was extensively necrotic with 10% viable tumor. The invasive grade 3/4 squamous cell carcinoma had formed a mass in the right lower lobe of the lung which had invaded into, but not through, the visceral pleura. Focally, the tumor had obliterated the bronchial lumen although the bronchial margin was negative for dysplasia/neoplasm. One of two intrapulmonary, one of four inferior mediastinal (subcarinal), and two of eight superior mediastinal lymph nodes were metastatically involved. Patient history included hemangioma and tobacco use; the patient was taking Doxycycline, a tetracycline, to treat an infection.

[0278] The frozen tissue was homogenized and lysed in guanidinium isothiocyanate solution using a POLYTRON homogenizer (Brinkmann Instruments, Westbury N.Y.). The lysate was centrifuged over a 5.7 M CsCl cushion using an SW28 rotor in an L8-70M ultracentrifuge (Beckman Coulter, Fullerton Calif.) for 18 hours at 25,000 rpm at ambient temperature. The RNA was extracted with acid phenol, pH 4.7, precipitated using 0.3 M sodium acetate and 2.5 volumes of ethanol, resuspended in RNAse-free water, and treated with DNAse at 37 °C. Extraction and precipitation were repeated as before. The mRNA was isolated using the OLIGOTEX kit (Qiagen, Chatsworth Calif.) and used to construct the cDNA library.

[0279] The mRNA was handled according to the recommended protocols in the SUPERSCRIPT plasmid system (Life Technologies). The cDNAs were fractionated on a SEPHAROSE CL4B column (APB), and those cDNAs exceeding 400 bp were ligated into pINCY plasmid (Incyte Genomics). The plasmid was subsequently transformed into DH5α competent cells (Life Technologies).

II Isolation and Sequencing of cDNA Clones

[0280] Plasmid DNA was released from the cells and purified using the REAL Prep 96 plasmid kit (Qiagen). This kit enabled the simultaneous purification of 96 samples in a 96-well block using multi-channel reagent dispensers. The recommended protocol was employed except for the following changes: 1) the bacteria were cultured in 1 ml of sterile TERRIFIC BROTH (BD Biosciences, Sparks MD) with carbenicillin at 25 mg/L and glycerol at 0.4%, 2) after incubation for 19 hours, the cells were lysed with 0.3 ml of lysis buffer and precipitated using isopropanol, and 3) the plasmid pellet was resuspended in 0.1 ml of distilled water. After the last step in the protocol, samples were transferred to a 96-well block for storage at 4 °C.

[0281] The cDNAs were prepared using a MICROLAB system (Hamilton) in combination with DNA ENGINE thermal cyclers (MJ Research), sequenced by the method of Sanger and Coulson (1975, J Mol Biol 94:441f) using 377 PRISM DNA sequencing systems (ABI), and reading frame was determined.

III Homology Searching of cDNA Clones and Their Deduced Proteins

[0282] The polynucleotides and/or amino acid sequences of the Sequence Listing were used to query sequences in the GenBank, SwissProt, BLOCKS, and Pima II databases. These databases, which contain previously identified and annotated sequences, were searched for regions of homology using BLAST, which stands for Basic Local Alignment Search Tool (Altschul (1993) J Mol Evol 36:290-300, Altschul et al. (1990) J Mol Biol 215:403-410).

[0283] BLAST produced alignments of both nucleotide and amino acid sequences to determine sequence similarity. Because of the local nature of the alignments, BLAST was especially useful in determining exact matches or in identifying homologs which may be of prokaryotic (bacterial) or eukaryotic (animal, fungal, or plant) origin. Other algorithms such as the one described in Smith et al. (1992, Protein Engineering 5:35-51) could have been used when dealing with primary sequence patterns and secondary structure gap penalties. The sequences disclosed in this application have lengths of at least 49 nucleotides, and no more than 12% uncalled bases (where N is recorded rather than A, C, G, or T).
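
The two sequence criteria stated at the end of the paragraph (length of at least 49 nucleotides and no more than 12% uncalled bases) can be expressed as a simple check. The function name and the example sequences are hypothetical.

```python
def meets_sequence_criteria(seq, min_len=49, max_n_fraction=0.12):
    """True if the read is long enough and has few enough uncalled (N) bases."""
    seq = seq.upper()
    if len(seq) < min_len:
        return False
    return seq.count("N") / len(seq) <= max_n_fraction


print(meets_sequence_criteria("ACGT" * 13))          # True: 52 nt, no N
print(meets_sequence_criteria("N" * 10 + "A" * 40))  # False: 20% N
```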

[0284] The BLAST approach searched for matches between a query sequence and a database sequence. BLAST evaluated the statistical significance of any matches found and reported only those matches that satisfied the user-selected threshold of significance. In this application, the threshold was set at 10⁻²⁵ for nucleotides and 10⁻¹⁴ for peptides.
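
Applying the two thresholds to a list of matches might look like the sketch below. The record layout (`id`, `p_value`), the function name, and the reading of the threshold as an upper bound on the reported probability are assumptions for illustration; this is not the BLAST program's own interface.

```python
# Thresholds stated in the text.
THRESHOLDS = {"nucleotide": 1e-25, "peptide": 1e-14}


def significant_hits(hits, query_type):
    """Keep only hits whose probability is at or below the threshold."""
    cutoff = THRESHOLDS[query_type]
    return [h for h in hits if h["p_value"] <= cutoff]


# Hypothetical match records.
hits = [{"id": "hit1", "p_value": 1e-30}, {"id": "hit2", "p_value": 1e-20}]
print([h["id"] for h in significant_hits(hits, "nucleotide")])  # ['hit1']
print([h["id"] for h in significant_hits(hits, "peptide")])     # ['hit1', 'hit2']
```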

[0285] Incyte polynucleotides were searched against the GenBank databases for primate (pri), rodent (rod), and other mammalian sequences (mam), and deduced amino acid sequences from the same clones were then searched against GenBank functional protein databases, mammalian (mamp), vertebrate (vrtp), and eukaryote (eukp) for homology.

IV Northern Analysis

[0286] Northern analysis is a laboratory technique used to detect the presence of a transcript of a gene and involves the hybridization of a labeled polynucleotide to a membrane on which RNAs from a particular cell type or tissue have been bound (Sambrook, supra).

[0287] Analogous computer techniques use BLAST to search for identical or related molecules in nucleotide databases such as GenBank or the LIFESEQ database (Incyte Genomics). This analysis is much faster than multiple, membrane-based hybridizations. In addition, the sensitivity of the computer search can be modified to determine whether any particular match is categorized as exact or homologous.

[0288] The basis of the search is the product score which is defined as:

(% sequence identity × % maximum BLAST score) / 100

[0289] The product score takes into account both the degree of similarity between two sequences and the length of the sequence match. For example, with a product score of 40, the match will be exact within a 1-2% error, and at 70, the match will be exact. Homologous molecules are usually identified by selecting those which show product scores between 15 and 40, although lower scores may identify related molecules.
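
The product score and the ranges mentioned above can be sketched as follows. The formula follows the definition in the text; treating 15, 40, and 70 as hard category boundaries is an interpretive assumption, as are the function names and labels.

```python
def product_score(percent_identity, percent_max_blast_score):
    """(% sequence identity x % maximum BLAST score) / 100."""
    return percent_identity * percent_max_blast_score / 100.0


def categorize(score):
    """Map a product score onto the ranges described in the text."""
    if score >= 70:
        return "exact"
    if score >= 40:
        return "exact within 1-2% error"
    if score >= 15:
        return "homologous"
    return "possibly related"


score = product_score(80, 50)
print(score, categorize(score))  # 40.0 exact within 1-2% error
```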

[0290] The results of northern analysis are reported as a list of libraries in which the transcript encoding HRM occurs. Abundance and percent abundance are also reported. Abundance directly reflects the number of times a particular transcript is represented in a cDNA library, and percent abundance is abundance divided by the total number of sequences examined in the cDNA library.
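
The two reported quantities can be computed as in the sketch below. Representing a library as a list of transcript identifiers and multiplying the ratio by 100 to express it as a percentage are assumptions for illustration; the identifiers are hypothetical.

```python
from collections import Counter


def abundance_report(library_transcripts, transcript_id):
    """Return (abundance, percent abundance) for one transcript in a library."""
    if not library_transcripts:
        raise ValueError("library contains no sequences")
    abundance = Counter(library_transcripts)[transcript_id]
    percent = 100.0 * abundance / len(library_transcripts)
    return abundance, percent


# Hypothetical library of 2000 sequenced clones, 5 of them the transcript of interest.
library = ["HRM-1"] * 5 + ["other"] * 1995
print(abundance_report(library, "HRM-1"))  # (5, 0.25)
```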

V Extension of HRM Encoding Polynucleotides

[0291] The nucleic acid sequence of an Incyte Clone disclosed in the Sequence Listing was used to design oligonucleotide primers for extending a partial sequence to full length. One primer was synthesized to initiate extension in the antisense direction, and the other was synthesized to extend sequence in the sense direction. Primers were used to facilitate the extension of the known sequence “outward” generating amplicons containing new, unknown nucleotide sequence for the region of interest. The initial primers were designed from the cDNA using OLIGO software (Molecular Insights), or another program, to be about 22 to about 30 nucleotides in length, to have a GC content of 50% or more, and to anneal to the target sequence at temperatures of about 68 to about 72 °C. Any stretch of nucleotides which would result in hairpin structures and primer-primer dimerizations was avoided.
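
The stated design limits can be checked with a small filter such as the one below. It is only a sketch: the annealing temperature is taken as an input (for example, from a design program) rather than computed, the hairpin and dimer screens are omitted, and the function name and example primer are hypothetical.

```python
def primer_within_limits(primer: str, annealing_temp_c: float) -> bool:
    """Check length (22-30 nt), GC content (>= 50%), and annealing
    temperature (68-72 degrees C) against the limits given in the text."""
    gc = (primer.count("G") + primer.count("C")) / len(primer)
    return (22 <= len(primer) <= 30
            and gc >= 0.5
            and 68 <= annealing_temp_c <= 72)


# Hypothetical 24-mer with 62.5% GC and a supplied annealing temperature.
print(primer_within_limits("GCGTACGGATCCGCTAGCATGCCA", 70.0))  # True
```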

[0292] Selected human cDNA libraries (Life Technologies) were used to extend the sequence. If more than one extension is necessary or desired, additional sets of primers are designed to further extend the known region.

[0293] High fidelity amplification was obtained by following the instructions for the XL-PCR kit (ABI) and thoroughly mixing the enzyme and reaction mix. Beginning with 40 pmol of each primer and the recommended concentrations of all other components of the kit, PCR was performed using the DNA ENGINE thermal cycler (MJ Research) and the following parameters: Step 1, 94 °C for 1 min (initial denaturation); Step 2, 65 °C for 1 min; Step 3, 68 °C for 6 min; Step 4, 94 °C for 15 sec; Step 5, 65 °C for 1 min; Step 6, 68 °C for 7 min; Step 7, repeat steps 4-6 for 15 additional cycles; Step 8, 94 °C for 15 sec; Step 9, 65 °C for 1 min; Step 10, 68 °C for 7:15 min; Step 11, repeat steps 8-10 for 12 cycles; Step 12, 72 °C (duration not stated); and Step 13, hold at 4 °C.

[0294] A 5-10 μl aliquot of the reaction mixture was analyzed by electrophoresis on a low concentration (about 0.6-0.8%) agarose mini-gel to determine which reactions were successful in extending the sequence. Bands thought to contain the largest products were excised from the gel, purified using QIAQUICK kit (Qiagen), and trimmed of overhangs using Klenow enzyme to facilitate religation and cloning.

[0295] After ethanol precipitation, the products were redissolved in 13 μl of ligation buffer, 1 μl T4-DNA ligase (15 units) and 1 μl T4 polynucleotide kinase were added, and the mixture was incubated at room temperature for 2-3 hours or overnight at 16 °C. Competent E. coli cells (in 40 μl of media) were transformed with 3 μl of ligation mixture and cultured in 80 μl of SOC medium (Sambrook, supra). After incubation for one hour at 37 °C, the E. coli mixture was plated on Luria Bertani (LB)-agar (Sambrook, supra) containing 2x carbenicillin (Carb). The following day, several colonies were randomly picked from each plate and cultured in 150 μl of liquid LB/2x Carb medium placed in an individual well of a commercially-available, sterile 96-well microtiter plate. The following day, 5 μl of each overnight culture was transferred into a non-sterile 96-well plate and after dilution 1:10 with water, 5 μl of each sample was transferred into a PCR array.

[0296] For PCR amplification, 18 μl of concentrated PCR reaction mix (3.3x) containing 4 units of rTth DNA polymerase, a vector primer, and one or both of the gene specific primers used for the extension reaction was added to each well. Amplification was performed using the following conditions: Step 1, 94 °C for 60 sec; Step 2, 94 °C for 20 sec; Step 3, 55 °C for 30 sec; Step 4, 72 °C for 90 sec; Step 5, repeat steps 2-4 for an additional 29 cycles; Step 6, 72 °C for 180 sec; and Step 7, hold at 4 °C.
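
The cycling conditions above can be written down as data and the programmed hold time totalled, as in the sketch below. The phase labels (denature, anneal, extend) are conventional interpretations added for readability rather than terms from the text, and the total excludes ramp times and the final 4 °C hold.

```python
# (label, temperature in degrees C, seconds) taken from Steps 1-6 above.
INITIAL = [("denature", 94, 60)]
CYCLE = [("denature", 94, 20), ("anneal", 55, 30), ("extend", 72, 90)]
FINAL = [("final extension", 72, 180)]
CYCLES = 1 + 29  # steps 2-4 once, then 29 additional cycles


def programmed_seconds(initial, cycle, cycles, final):
    """Sum the programmed hold times, ignoring ramping between temperatures."""
    once = sum(sec for _, _, sec in initial + final)
    per_cycle = sum(sec for _, _, sec in cycle)
    return once + cycles * per_cycle


total = programmed_seconds(INITIAL, CYCLE, CYCLES, FINAL)
print(total, "s =", total / 60, "min")  # 4440 s = 74.0 min
```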

[0297] Aliquots of the PCR reactions were run on agarose gels together with molecular weight markers. The sizes of the PCR products were compared to the original partial cDNAs, and clones were selected, ligated into plasmid, and sequenced.

[0298] In like manner, a genomic library and a polynucleotide selected from SEQ ID NOs:50-98 are used to obtain 5′ regulatory sequences using the procedure above.

VI Labeling and Use of Individual Hybridization Probes

[0299] Hybridization probes derived from SEQ ID NOs:50-98 are employed to screen cDNAs, genomic DNAs, or mRNAs. Although the labeling of oligonucleotides, consisting of about 20 base-pairs, is specifically described, the same procedure is used with larger nucleotide fragments. Oligonucleotides are designed using state-of-the-art software such as OLIGO software (Molecular Insights) and labeled by combining 50 pmol of each oligomer and 250 μCi of [γ-³²P] adenosine triphosphate (APB) and T4 polynucleotide kinase (NEN Life Science Products, Acton Mass.). The labeled oligonucleotides are purified using SEPHADEX G-25 superfine resin column (APB). An aliquot containing 10⁷ counts per minute of the labeled probe is used in a typical membrane-based hybridization analysis of human genomic DNA digested with one of the following endonucleases (Ase I, Bgl II, Eco RI, Pst I, Xba I, or Pvu II, NEN Life Science Products).

[0300] The DNA from each digest is fractionated on a 0.7 percent agarose gel and transferred to NYTRANPLUS membranes (Schleicher & Schuell, Durham NH). Hybridization is carried out for 16 hours at 40 °C. To remove nonspecific signals, blots are sequentially washed at room temperature under increasingly stringent conditions up to 0.1 × saline sodium citrate and 0.5% sodium dodecyl sulfate. After XOMAT AR film (Eastman Kodak, Rochester N.Y.) is exposed to the blots in a PHOSPHOIMAGER cassette (APB) for several hours, hybridization patterns are compared visually.

VII Microarrays

[0301] To produce oligonucleotides for a microarray, SEQ ID NOs:50-98 are examined using a computer algorithm which starts at the 3′ end of the polynucleotide. The algorithm identifies oligomers of defined length that are unique to the gene, have a GC content within a range for hybridization, and lack predicted secondary structure that would interfere with hybridization. The algorithm identifies approximately 20 sequence-specific oligonucleotides of 20 nucleotides in length (20-mers). A matched set of oligonucleotides is created in which one nucleotide in the center of each sequence is altered. This process is repeated for each gene in the microarray, and double sets of twenty 20-mers are synthesized and arranged on the surface of the silicon chip using a light-directed chemical process (described in PCT/WO95/11995).

[0302] In the alternative, a chemical coupling procedure and an ink jet device are used to synthesize oligomers on the surface of a substrate (PCT/WO95/251116). In another alternative, a gridded array is used to arrange and link cDNA fragments or oligonucleotides to the surface of a substrate using a vacuum system, thermal, UV, mechanical, or chemical bonding procedures. A typical array may be produced by hand or using available materials and machines and contain grids of 8 dots, 24 dots, 96 dots, 384 dots, 1536 dots or 6144 dots. After hybridization, the microarray is washed to remove nonhybridized probes, and a scanner is used to determine the levels and patterns of fluorescence. The scanned image is examined to determine degree of complementarity and the relative abundance/expression level of each sequence in the microarray.

VIII Complementary Polynucleotides

[0303] Sequence complementary to the sequence encoding HRM, or any part thereof, is used to detect, decrease or inhibit expression of naturally occurring HRM. Although use of oligonucleotides comprising from about 15 to about 30 base-pairs is described, the same procedure is used with smaller or larger sequence fragments. Oligonucleotides are designed using OLIGO software (Molecular Insights) and the coding sequence of SEQ ID NOs:50-98. To inhibit transcription, a complementary oligonucleotide is designed from the most unique 5′ sequence and used to prevent promoter binding to the coding sequence. To inhibit translation, a complementary oligonucleotide is designed to prevent ribosomal binding to the transcript encoding HRM.
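By way of illustration only, the derivation of a complementary oligonucleotide from a chosen stretch of coding sequence can be sketched in Python as follows. This is not the OLIGO software. The function names and the default length of 20 bases are assumptions, and the start position of the target region is supplied by the user rather than chosen by any measure of uniqueness.

```python
# Translation table mapping each DNA base to its Watson-Crick complement.
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


def antisense_oligo(coding_seq, start=0, length=20):
    """Return an oligonucleotide complementary to coding_seq[start:start + length].

    start and length are hypothetical parameters; the text describes
    oligonucleotides of about 15 to about 30 bases.
    """
    target = coding_seq[start:start + length]
    if len(target) < length:
        raise ValueError("target region is shorter than the requested oligonucleotide")
    return reverse_complement(target)
```

For example, antisense_oligo("ATGGCCAAGT", 0, 4) yields the sequence complementary to the first four bases of that hypothetical coding sequence, written 5′ to 3′.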

IX Expression of HRM

[0304] Expression of HRM is accomplished by subcloning the cDNAs into vectors and transforming the vectors into host cells. In this case, the cloning vector is also used to express HRM in E. coli. Upstream of the cloning site, this vector contains a promoter for β-galactosidase, followed by sequence containing the amino-terminal Met, and the subsequent seven residues of β-galactosidase. Immediately following these eight residues is a bacteriophage promoter useful for transcription and a linker containing a number of unique restriction sites.

[0305] Induction of an isolated, transformed bacterial strain with IPTG using standard methods produces a fusion protein which consists of the first eight residues of β-galactosidase, about 5 to 15 residues of linker, and the full length protein. The signal residues direct the secretion of HRM into the bacterial growth media which can be used directly in the following assay for activity.

X Demonstration of HRM Activity

[0306] HRM can be expressed in a mammalian cell line such as DLD-1 or HCT116 (ATCC) by transforming the cells with a eukaryotic expression vector encoding HRM. Eukaryotic expression vectors are commercially available and the techniques to introduce them into cells are well known to those skilled in the art. The effect of HRM on cell morphology may be visualized by microscopy, the effect on cell growth may be determined by measuring cell doubling-time, and the effect on tumorigenicity may be assessed by the ability of transformed cells to grow in a soft agar growth assay (Groden (1995) Cancer Res 55:1531-1539).
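Cell doubling-time may, for example, be estimated from two cell counts taken during exponential growth. The following Python sketch assumes simple exponential growth between the two counts; the function name and arguments are hypothetical and are provided for illustration only.

```python
import math


def doubling_time(n0, nt, hours):
    """Estimate doubling time (in hours) from two cell counts.

    Assumes exponential growth, N(t) = N0 * 2 ** (t / Td), so that
    Td = t * ln(2) / ln(Nt / N0).
    """
    if n0 <= 0 or nt <= n0 or hours <= 0:
        raise ValueError("counts must be positive and increasing, and hours must be positive")
    return hours * math.log(2) / math.log(nt / n0)
```

Under this assumption, a culture that grows from 1 × 10⁵ to 4 × 10⁵ cells in 48 hours has doubled twice, which corresponds to a doubling time of 24 hours.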

XI Production of HRM Specific Antibodies

[0307] HRM that is purified using PAGE electrophoresis (Sambrook, supra), or other purification techniques, is used to immunize rabbits and to produce antibodies using standard protocols. In the alternative, an amino acid sequence deduced from SEQ ID NOs:50-98 is analyzed using LASERGENE software (DNASTAR, Madison Wis.) to determine regions of high immunogenicity, and an oligopeptide is synthesized and used to raise antibodies by means known to those of skill in the art. Selection of epitopes, such as those near the C-terminus or in hydrophilic regions, is described by Ausubel (supra).

[0308] Typically, the oligopeptides are 15 residues in length, synthesized using a 431A peptide synthesizer (ABI) using Fmoc-chemistry, and coupled to keyhole limpet hemocyanin (Sigma-Aldrich, St. Louis Mo.) by reaction with N-maleimidobenzoyl-N-hydroxysuccinimide ester (Ausubel supra). Rabbits are immunized with the oligopeptide-KLH complex in complete Freund's adjuvant. The resulting antisera are tested for antipeptide activity, for example, by binding the protein to a substrate, blocking with 1% BSA, reacting with rabbit antisera, washing, and reacting with radioiodinated, goat anti-rabbit IgG.

XII Purification of Naturally Occurring HRM Using Specific Antibodies

[0309] Naturally occurring or recombinant HRM is substantially purified by immunoaffinity chromatography using antibodies specific for HRM. An immunoaffinity column is constructed by covalently coupling HRM antibody to an activated chromatographic resin, such as CNBr-activated SEPHAROSE resin (APB). After the coupling, the resin is blocked and washed according to the manufacturer's instructions.

[0310] Media containing HRM is passed over the immunoaffinity column, and the column is washed under conditions that allow the preferential absorbance of HRM (e.g., high ionic strength buffers in the presence of detergent). The column is eluted under conditions that disrupt antibody/protein binding (e.g., a buffer of pH 2-3 or a high concentration of a chaotrope, such as urea or thiocyanate ion), and HRM is collected.

XIII Identification of Molecules Which Interact with HRM

[0311] HRM, or biologically active portions thereof, are labeled with ¹²⁵I Bolton-Hunter reagent (Bolton et al. (1973) Biochem J 133:529-39). Candidate molecules previously arrayed in the wells of a multi-well plate are incubated with the labeled HRM, washed and any wells with labeled HRM complex are assayed. Data obtained using different concentrations of HRM are used to calculate values for the number, affinity, and association of HRM with the candidate molecules.
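The specification does not name the calculation applied to the binding data. One conventional possibility, offered here purely as an assumption, is a Scatchard analysis of a single class of binding sites, in which the ratio of bound to free labeled HRM is regressed on the bound concentration. The following Python sketch uses hypothetical names, takes free and bound concentrations in the same arbitrary units, and returns the dissociation constant and the number of binding sites.

```python
def scatchard(free, bound):
    """Estimate (Kd, Bmax) for one-site binding by a Scatchard plot.

    Fits bound/free = Bmax/Kd - bound/Kd by ordinary least squares, so that
    the slope equals -1/Kd and the intercept equals Bmax/Kd.
    free and bound are equal-length sequences of positive concentrations.
    """
    if len(free) != len(bound) or len(free) < 2:
        raise ValueError("need at least two paired measurements")
    x = list(bound)
    y = [b / f for b, f in zip(bound, free)]
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    kd = -1.0 / slope        # dissociation constant (affinity)
    bmax = intercept * kd    # total concentration of binding sites (number)
    return kd, bmax
```

The association constant is the reciprocal of the returned Kd. A linear Scatchard plot is appropriate only when a single, non-cooperative class of sites is present; curved plots call for a different model.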

[0312] All publications and patents mentioned in the above specification are herein incorporated by reference. Various modifications and variations of the described method and system of the invention will be apparent to those skilled in the art without departing from the scope and spirit of the invention. Although the invention has been described in connection with specific preferred embodiments, it should be understood that the invention as claimed should not be unduly limited to such specific embodiments. Indeed, various modifications of the described modes for carrying out the invention which are obvious to those skilled in molecular biology or related fields are intended to be within the scope of the following claims.

1 98 151 amino acids amino acid single linear U937NOT01 133 1 Met ThrAsn Glu Glu Pro Leu Pro Lys Lys Val Arg Leu Ser Glu 5 10 15 Thr Asp PheLys Val Met Ala Arg Asp Glu Leu Ile Leu Arg Trp 20 25 30 Lys Gln Tyr GluAla Tyr Val Gln Ala Leu Glu Gly Lys Tyr Thr 35 40 45 Asp Leu Asn Ser AsnAsp Val Thr Gly Leu Arg Glu Ser Glu Glu 50 55 60 Lys Leu Lys Gln Gln GlnGln Glu Ser Ala Arg Arg Glu Asn Ile 65 70 75 Leu Val Met Arg Leu Ala ThrLys Glu Gln Glu Met Gln Glu Cys 80 85 90 Thr Thr Gln Ile Gln Tyr Leu LysGln Val Gln Gln Pro Ser Val 95 100 105 Ala Gln Leu Arg Ser Thr Met ValAsp Pro Ala Ile Asn Leu Phe 110 115 120 Phe Leu Lys Met Lys Gly Glu LeuGlu Gln Thr Lys Asp Lys Leu 125 130 135 Glu Gln Ala Gln Asn Glu Leu SerAla Trp Lys Phe Thr Pro Asp 140 145 150 Arg 185 amino acids amino acidsingle linear U937NOT01 1762 2 Met Leu Thr Leu Ala Ser Lys Leu Lys ArgAsp Asp Gly Leu Lys 5 10 15 Gly Ser Arg Thr Ala Ala Thr Ala Ser Asp SerThr Arg Arg Val 20 25 30 Ser Val Arg Asp Lys Leu Leu Val Lys Glu Val AlaGlu Leu Glu 35 40 45 Ala Asn Leu Pro Cys Thr Cys Lys Val His Phe Pro AspPro Asn 50 55 60 Lys Leu His Cys Phe Gln Leu Thr Val Thr Pro Asp Glu GlyTyr 65 70 75 Tyr Gln Gly Gly Lys Phe Gln Phe Glu Thr Glu Val Pro Asp Ala80 85 90 Tyr Asn Met Val Pro Pro Lys Val Lys Cys Leu Thr Lys Ile Trp 95100 105 His Pro Asn Ile Thr Glu Thr Gly Glu Ile Cys Leu Ser Leu Leu 110115 120 Arg Glu His Ser Ile Asp Gly Thr Gly Trp Ala Pro Thr Arg Thr 125130 135 Leu Lys Asp Val Val Trp Gly Leu Asn Ser Leu Phe Thr Asp Leu 140145 150 Leu Asn Phe Asp Asp Pro Leu Asn Ile Glu Ala Ala Glu His His 155160 165 Leu Arg Asp Lys Glu Asp Phe Arg Asn Lys Val Asp Asp Tyr Ile 170175 180 Lys Arg Tyr Ala Arg 185 59 amino acids amino acid single linearU937NOT01 1847 3 Met Gly Lys Val Asn Val Ala Lys Leu Arg Tyr Met Ser ArgAsp 5 10 15 Asp Phe Arg Val Leu Thr Ala Val Glu Met Gly Met Lys Asn His20 25 30 Glu Ile Val Pro Gly Ser Leu Ile Ala Ser Ile Ala Ser Leu Lys 3540 45 His Gly Gly Cys Asn Lys Val Leu Arg Glu Leu Val Lys His 50 55 338amino 
acids amino acid single linear HMC1NOT01 9337 4 Met Leu Glu ThrPhe Gly His Leu Val Ser Val Gly Trp Glu Thr 5 10 15 Thr Leu Glu Asn LysGlu Leu Ala Pro Asn Ser Asp Ile Pro Glu 20 25 30 Glu Glu Pro Ala Pro SerLeu Lys Val Gln Glu Ser Ser Arg Asp 35 40 45 Cys Ala Leu Ser Ser Thr LeuGlu Asp Thr Leu Gln Gly Gly Val 50 55 60 Gln Glu Val Gln Asp Thr Val LeuLys Gln Met Glu Ser Ala Gln 65 70 75 Glu Lys Asp Leu Pro Gln Lys Lys HisPhe Asp Asn Arg Glu Ser 80 85 90 Gln Ala Asn Ser Gly Ala Leu Asp Thr AsnGln Val Ser Leu Gln 95 100 105 Lys Ile Asp Asn Pro Glu Ser Gln Ala AsnSer Gly Ala Leu Asp 110 115 120 Thr Asn Gln Val Leu Leu His Lys Ile ProPro Arg Lys Arg Leu 125 130 135 Arg Lys Arg Asp Ser Gln Val Lys Ser MetLys His Asn Ser Arg 140 145 150 Val Lys Ile His Gln Lys Ser Cys Glu ArgGln Lys Ala Lys Glu 155 160 165 Gly Asn Gly Cys Arg Lys Thr Phe Ser ArgSer Thr Lys Gln Ile 170 175 180 Thr Phe Ile Arg Ile His Lys Gly Ser GlnVal Cys Arg Cys Ser 185 190 195 Glu Cys Gly Lys Ile Phe Arg Asn Pro ArgTyr Phe Ser Val His 200 205 210 Lys Lys Ile His Thr Gly Glu Arg Pro TyrVal Cys Gln Asp Cys 215 220 225 Gly Lys Gly Phe Val Gln Ser Ser Ser LeuThr Gln His Gln Arg 230 235 240 Val His Ser Gly Glu Arg Pro Phe Glu CysGln Glu Cys Gly Arg 245 250 255 Thr Phe Asn Asp Arg Ser Ala Ile Ser GlnHis Leu Arg Thr His 260 265 270 Thr Gly Ala Lys Pro Tyr Lys Cys Gln AspCys Gly Lys Ala Phe 275 280 285 Arg Gln Ser Ser His Leu Ile Arg His GlnArg Thr His Thr Gly 290 295 300 Glu Arg Pro Tyr Ala Cys Asn Lys Cys GlyLys Ala Phe Thr Gln 305 310 315 Ser Ser His Leu Ile Gly His Gln Arg ThrHis Asn Arg Thr Lys 320 325 330 Arg Lys Lys Lys Gln Pro Thr Ser 335 456amino acids amino acid single linear HMC1NOT01 9476 5 Met Lys Ile GluGlu Val Lys Ser Thr Thr Lys Thr Gln Arg Ile 5 10 15 Ala Ser His Ser HisVal Lys Gly Leu Gly Leu Asp Glu Ser Gly 20 25 30 Leu Ala Lys Gln Ala AlaSer Gly Leu Val Gly Gln Glu Asn Ala 35 40 45 Arg Glu Ala Cys Gly Val IleVal Glu Leu Ile Glu Ser Lys Lys 50 55 60 Met Ala Gly Arg Ala Val Leu LeuAla Gly Pro 
Pro Gly Thr Gly 65 70 75 Lys Thr Ala Leu Ala Leu Ala Ile AlaGln Glu Leu Gly Ser Lys 80 85 90 Val Pro Phe Cys Pro Met Val Gly Ser GluVal Tyr Ser Thr Glu 95 100 105 Ile Lys Lys Thr Glu Val Leu Met Glu AsnPhe Arg Arg Ala Ile 110 115 120 Gly Leu Arg Ile Lys Glu Thr Lys Glu ValTyr Glu Gly Glu Val 125 130 135 Thr Glu Leu Thr Pro Cys Glu Thr Glu AsnPro Met Gly Gly Tyr 140 145 150 Gly Lys Thr Ile Ser His Val Ile Ile GlyLeu Lys Thr Ala Lys 155 160 165 Gly Thr Lys Gln Leu Lys Leu Asp Pro SerIle Phe Glu Ser Leu 170 175 180 Gln Lys Glu Arg Val Glu Ala Gly Asp ValIle Tyr Ile Glu Ala 185 190 195 Asn Ser Gly Ala Val Lys Arg Gln Gly ArgCys Asp Thr Tyr Ala 200 205 210 Thr Glu Phe Asp Leu Glu Ala Glu Glu TyrVal Pro Leu Pro Lys 215 220 225 Gly Asp Val His Lys Lys Lys Glu Ile IleGln Asp Val Thr Leu 230 235 240 His Asp Leu Asp Val Ala Asn Ala Arg ProGln Gly Gly Gln Asp 245 250 255 Ile Leu Ser Met Met Gly Gln Leu Met LysPro Lys Lys Thr Glu 260 265 270 Ile Thr Asp Lys Leu Arg Gly Glu Ile AsnLys Val Val Asn Lys 275 280 285 Tyr Ile Asp Gln Gly Ile Ala Glu Leu ValPro Gly Val Leu Phe 290 295 300 Val Asp Glu Val His Met Leu Asp Ile GluCys Phe Thr Tyr Leu 305 310 315 His Arg Ala Leu Glu Ser Ser Ile Ala ProIle Val Ile Phe Ala 320 325 330 Ser Asn Arg Gly Asn Cys Val Ile Arg GlyThr Glu Asp Ile Thr 335 340 345 Ser Pro His Gly Ile Pro Leu Asp Leu LeuAsp Arg Val Met Ile 350 355 360 Ile Arg Thr Met Leu Tyr Thr Pro Gln GluMet Lys Gln Ile Ile 365 370 375 Lys Ile Arg Ala Gln Thr Glu Gly Ile AsnIle Ser Glu Glu Ala 380 385 390 Leu Asn His Leu Gly Glu Ile Gly Thr LysThr Thr Leu Arg Tyr 395 400 405 Ser Val Gln Leu Leu Thr Pro Ala Asn LeuLeu Ala Lys Ile Asn 410 415 420 Gly Lys Asp Ser Ile Glu Lys Glu His ValGlu Glu Ile Ser Glu 425 430 435 Leu Phe Tyr Asp Ala Lys Ser Ser Ala LysIle Leu Ala Asp Gln 440 445 450 Gln Asp Lys Tyr Met Lys 455 210 aminoacids amino acid single linear THP1PLB01 10370 6 Met Val Leu Trp Leu LysGly Val Thr Phe Asn Val Thr Thr Val 5 10 15 Asp Thr Lys Arg Arg Thr GluThr Val Gln Lys Leu Cys 
Pro Gly 20 25 30 Gly Gln Leu Pro Phe Leu Leu TyrGly Thr Glu Val His Thr Asp 35 40 45 Thr Asn Lys Ile Glu Glu Phe Leu GluAla Val Leu Cys Pro Pro 50 55 60 Arg Tyr Pro Lys Leu Ala Ala Leu Asn ProGlu Ser Asn Thr Ala 65 70 75 Gly Leu Asp Ile Phe Ala Lys Phe Ser Ala TyrIle Lys Asn Ser 80 85 90 Asn Pro Ala Leu Asn Asp Asn Leu Glu Lys Gly LeuLeu Lys Ala 95 100 105 Leu Lys Val Leu Asp Asn Tyr Leu Thr Ser Pro LeuPro Glu Glu 110 115 120 Val Asp Glu Thr Ser Ala Glu Asp Glu Gly Val SerGln Arg Lys 125 130 135 Phe Leu Asp Gly Asn Glu Leu Thr Leu Ala Asp CysAsn Leu Leu 140 145 150 Pro Lys Leu His Ile Val Gln Val Val Cys Lys LysTyr Arg Gly 155 160 165 Phe Thr Ile Pro Glu Ala Phe Arg Gly Val His ArgTyr Leu Ser 170 175 180 Asn Ala Tyr Ala Arg Glu Glu Phe Ala Ser Thr CysPro Asp Asp 185 190 195 Glu Glu Ile Glu Leu Ala Tyr Glu Gln Val Ala LysAla Leu Lys 200 205 210 255 amino acids amino acid single linearTHP1NOB01 30137 7 Met Leu Gly Gln Leu Leu Pro His Thr Ala Arg Gly LeuGly Ala 5 10 15 Ala Glu Met Pro Gly Gln Gly Pro Gly Ser Asp Trp Thr GluArg 20 25 30 Ser Ser Ser Ala Glu Pro Pro Ala Val Ala Gly Thr Glu Gly Gly35 40 45 Gly Gly Gly Ser Ala Gly Tyr Ser Cys Tyr Gln Asn Ser Lys Gly 5055 60 Ser Asp Arg Ile Lys Asp Gly Tyr Lys Val Asn Ser His Ile Ala 65 7075 Lys Leu Gln Glu Leu Trp Lys Thr Pro Gln Asn Gln Thr Ile His 80 85 90Leu Ser Lys Ser Met Met Glu Ala Ser Phe Phe Lys His Pro Asp 95 100 105Leu Thr Thr Gly Gln Lys Arg Tyr Leu Cys Ser Ile Ala Lys Ile 110 115 120Tyr Asn Ala Asn Tyr Leu Lys Met Leu Met Lys Arg Gln Tyr Met 125 130 135His Val Leu Gln His Ser Ser Gln Lys Pro Gly Val Leu Thr His 140 145 150His Arg Ser Arg Leu Ser Ser Arg Tyr Ser Gln Lys Gln His Tyr 155 160 165Pro Cys Thr Thr Trp Arg His Gln Leu Glu Arg Glu Asp Ser Gly 170 175 180Ser Ser Asp Ile Ala Ala Ala Ser Ala Pro Glu Met Leu Ile Gln 185 190 195His Ser Leu Trp Arg Pro Val Arg Asn Lys Glu Gly Ile Lys Thr 200 205 210Gly Tyr Ala Ser Lys Thr Arg Cys Lys Ser Leu Lys Ile Phe Arg 215 220 225Arg Pro Arg Lys Leu Phe Met Gln Thr Val 
Ser Ser Asp Asp Ser 230 235 240Glu Ser His Met Ser Gly Glu Lys Lys Gly Arg Gly Phe Thr Thr 245 250 255188 amino acids amino acid single linear SYNORAB01 77180 8 Met Ala LeuAla Met Leu Val Leu Val Val Ser Pro Trp Ser Ala 5 10 15 Ala Arg Gly ValLeu Arg Asn Tyr Trp Glu Arg Leu Leu Arg Lys 20 25 30 Leu Pro Gln Ser ArgPro Gly Phe Pro Ser Pro Pro Trp Gly Pro 35 40 45 Ala Leu Ala Val Gln GlyPro Ala Met Phe Thr Glu Pro Ala Asn 50 55 60 Asp Thr Ser Gly Ser Lys GluAsn Ser Ser Leu Leu Asp Ser Ile 65 70 75 Phe Trp Met Ala Ala Pro Lys AsnArg Arg Thr Ile Glu Val Asn 80 85 90 Arg Cys Arg Arg Arg Asn Pro Gln LysLeu Ile Lys Val Lys Asn 95 100 105 Asn Ile Asp Val Cys Pro Glu Cys GlyHis Leu Lys Gln Lys His 110 115 120 Val Leu Cys Ala Tyr Cys Tyr Glu LysVal Cys Lys Glu Thr Ala 125 130 135 Glu Ile Arg Arg Gln Ile Gly Lys GlnGlu Gly Gly Pro Phe Lys 140 145 150 Ala Pro Thr Ile Glu Thr Val Val LeuTyr Thr Gly Glu Thr Pro 155 160 165 Ser Glu Gln Asp Gln Gly Lys Arg IleIle Glu Arg Asp Arg Lys 170 175 180 Arg Pro Ser Trp Phe Thr Gln Asn 185531 amino acids amino acid single linear PITUNOR01 98974 9 Met Ala ProThr Ile Gln Thr Gln Ala Gln Arg Glu Asp Gly His 5 10 15 Arg Pro Asn SerHis Arg Thr Leu Pro Glu Arg Ser Gly Val Val 20 25 30 Cys Arg Val Lys TyrCys Asn Ser Leu Pro Asp Ile Pro Phe Asp 35 40 45 Pro Lys Phe Ile Thr TyrPro Phe Asp Gln Asn Arg Phe Val Gln 50 55 60 Tyr Lys Ala Thr Ser Leu GluLys Gln His Lys His Asp Leu Leu 65 70 75 Thr Glu Pro Asp Leu Gly Val ThrIle Asp Leu Ile Asn Pro Asp 80 85 90 Thr Tyr Arg Ile Asp Pro Asn Val LeuLeu Asp Pro Ala Asp Glu 95 100 105 Lys Leu Leu Glu Glu Glu Ile Gln AlaPro Thr Ser Ser Lys Arg 110 115 120 Ser Gln Gln His Ala Lys Val Val ProTrp Met Arg Lys Thr Glu 125 130 135 Tyr Ile Ser Thr Glu Phe Asn Arg TyrGly Ile Ser Asn Glu Lys 140 145 150 Pro Glu Val Lys Ile Gly Val Ser ValLys Gln Gln Phe Thr Glu 155 160 165 Glu Glu Ile Tyr Lys Asp Arg Asp SerGln Ile Thr Ala Ile Glu 170 175 180 Lys Thr Phe Glu Asp Ala Gln Lys SerIle Ser Gln His Tyr Ser 185 190 195 Lys Pro 
Arg Val Thr Pro Val Glu ValMet Pro Val Phe Pro Asp 200 205 210 Phe Lys Met Trp Ile Asn Pro Cys AlaGln Val Ile Phe Asp Ser 215 220 225 Asp Pro Ala Pro Lys Asp Thr Ser GlyAla Ala Ala Leu Glu Met 230 235 240 Met Ser Gln Ala Met Ile Arg Gly MetMet Asp Glu Glu Gly Asn 245 250 255 Gln Phe Val Ala Tyr Phe Leu Pro ValGlu Glu Thr Leu Lys Lys 260 265 270 Arg Lys Arg Asp Gln Glu Glu Glu MetAsp Tyr Ala Pro Asp Asp 275 280 285 Val Tyr Asp Tyr Lys Ile Ala Arg GluTyr Asn Trp Asn Val Lys 290 295 300 Asn Lys Ala Ser Lys Gly Tyr Glu GluAsn Tyr Phe Phe Ile Phe 305 310 315 Arg Glu Gly Asp Gly Val Tyr Tyr AsnGlu Leu Glu Thr Arg Val 320 325 330 Arg Leu Ser Lys Arg Arg Ala Lys AlaGly Val Gln Ser Gly Thr 335 340 345 Asn Ala Leu Leu Val Val Lys His ArgAsp Met Asn Glu Lys Glu 350 355 360 Leu Glu Ala Gln Glu Ala Arg Lys AlaGln Leu Glu Asn His Glu 365 370 375 Pro Glu Glu Glu Glu Glu Glu Glu MetGlu Thr Glu Glu Lys Glu 380 385 390 Ala Gly Gly Ser Asp Glu Glu Gln GluLys Gly Ser Ser Ser Glu 395 400 405 Lys Glu Gly Ser Glu Asp Glu His SerGly Ser Glu Ser Glu Arg 410 415 420 Glu Glu Gly Asp Arg Asp Glu Ala SerAsp Lys Ser Gly Ser Gly 425 430 435 Glu Asp Glu Ser Ser Glu Asp Glu AlaArg Ala Ala Arg Asp Lys 440 445 450 Glu Glu Ile Phe Gly Ser Asp Ala AspSer Glu Asp Asp Ala Asp 455 460 465 Ser Asp Asp Glu Asp Arg Gly Gln AlaGln Gly Gly Ser Asp Asn 470 475 480 Asp Ser Asp Ser Gly Ser Asn Gly GlyGly Gln Arg Ser Arg Ser 485 490 495 His Ser Arg Ser Ala Ser Pro Phe ProSer Gly Ser Glu His Ser 500 505 510 Ala Gln Glu Asp Gly Ser Glu Ala AlaAla Ser Asp Ser Ser Glu 515 520 525 Ala Asp Ser Asp Ser Asp 530 348amino acids amino acid single linear MUSCNOT01 118160 10 Met Gly Gln GluGlu Glu Leu Leu Arg Ile Ala Lys Lys Leu Glu 5 10 15 Lys Met Val Ala ArgLys Asn Thr Glu Gly Ala Leu Asp Leu Leu 20 25 30 Lys Lys Leu His Ser CysGln Met Ser Ile Gln Leu Leu Gln Thr 35 40 45 Thr Arg Ile Gly Val Ala ValAsn Gly Val Arg Lys His Cys Ser 50 55 60 Asp Lys Glu Val Val Ser Leu AlaLys Val Leu Ile Lys Asn Trp 65 70 75 Lys Arg Leu Leu 
Asp Ser Pro Gly ProPro Lys Gly Glu Lys Gly 80 85 90 Glu Glu Arg Glu Lys Ala Lys Lys Lys GluLys Gly Leu Glu Cys 95 100 105 Ser Asp Trp Lys Pro Glu Ala Gly Leu SerPro Pro Arg Lys Lys 110 115 120 Arg Glu Asp Pro Lys Thr Arg Arg Asp SerVal Asp Ser Lys Ser 125 130 135 Ser Ala Ser Ser Ser Pro Lys Arg Pro SerVal Glu Arg Ser Asn 140 145 150 Ser Ser Lys Ser Lys Ala Glu Ser Pro LysThr Pro Ser Ser Pro 155 160 165 Leu Thr Pro Thr Phe Ala Ser Ser Met CysLeu Leu Ala Pro Cys 170 175 180 Tyr Leu Thr Gly Asp Ser Val Arg Asp LysCys Val Glu Met Leu 185 190 195 Ser Ala Ala Leu Lys Ala Asp Asp Asp TyrLys Asp Tyr Gly Val 200 205 210 Asn Cys Asp Lys Met Ala Ser Glu Ile GluAsp His Ile Tyr Gln 215 220 225 Glu Leu Lys Ser Thr Asp Met Lys Tyr ArgAsn Arg Val Arg Ser 230 235 240 Arg Ile Ser Asn Leu Lys Asp Pro Arg AsnPro Gly Leu Arg Arg 245 250 255 Asn Val Leu Ser Gly Ala Ile Ser Ala GlyLeu Ile Ala Lys Met 260 265 270 Thr Ala Glu Glu Met Ala Ser Asp Glu LeuArg Glu Leu Arg Asn 275 280 285 Ala Met Thr Gln Glu Ala Ile Arg Glu HisGln Met Ala Lys Thr 290 295 300 Gly Gly Thr Thr Thr Asp Leu Phe Gln CysSer Lys Cys Lys Lys 305 310 315 Lys Asn Cys Thr Tyr Asn Gln Val Gln ThrArg Ser Ala Asp Glu 320 325 330 Pro Met Thr Thr Phe Val Leu Cys Asn GluCys Gly Asn Arg Trp 335 340 345 Lys Phe Cys 393 amino acids amino acidsingle linear TLYMNOR01 140516 11 Met Arg Thr Leu Phe Asn Leu Leu TrpLeu Ala Leu Ala Cys Ser 5 10 15 Pro Val His Thr Thr Leu Ser Lys Ser AspAla Lys Lys Ala Ala 20 25 30 Ser Lys Thr Leu Leu Glu Lys Ser Gln Phe SerAsp Lys Pro Val 35 40 45 Gln Asp Arg Gly Leu Val Val Thr Asp Leu Lys AlaGlu Ser Val 50 55 60 Val Leu Glu His Arg Ser Tyr Cys Ser Ala Lys Ala ArgAsp Arg 65 70 75 His Phe Ala Gly Asp Val Leu Gly Tyr Val Thr Pro Trp AsnSer 80 85 90 His Gly Tyr Asp Val Thr Lys Val Phe Gly Ser Lys Phe Thr Gln95 100 105 Ile Ser Pro Val Trp Leu Gln Leu Lys Arg Arg Gly Arg Glu Met110 115 120 Phe Glu Val Thr Gly Leu His Asp Val Asp Gln Gly Trp Met Arg125 130 135 Ala Val Arg Lys His Ala Lys Gly Leu His Ile Val 
Pro Arg Leu140 145 150 Leu Phe Glu Asp Trp Thr Tyr Asp Asp Phe Arg Asn Val Leu Asp155 160 165 Ser Glu Asp Glu Ile Glu Glu Leu Ser Lys Thr Val Val Gln Val170 175 180 Ala Lys Asn Gln His Phe Asp Gly Phe Val Val Glu Val Trp Asn185 190 195 Gln Leu Leu Ser Gln Lys Arg Val Gly Leu Ile His Met Leu Thr200 205 210 His Leu Ala Glu Ala Leu His Gln Ala Arg Leu Leu Ala Leu Leu215 220 225 Val Ile Pro Pro Ala Ile Thr Pro Gly Thr Asp Gln Leu Gly Met230 235 240 Phe Thr His Lys Glu Phe Glu Gln Leu Ala Pro Val Leu Asp Gly245 250 255 Phe Ser Leu Met Thr Tyr Asp Tyr Ser Thr Ala His Gln Pro Gly260 265 270 Pro Asn Ala Pro Leu Ser Trp Val Arg Ala Cys Val Gln Val Leu275 280 285 Asp Pro Lys Ser Lys Trp Arg Ser Lys Ile Leu Leu Gly Leu Asn290 295 300 Phe Tyr Gly Met Asp Tyr Ala Thr Ser Lys Asp Ala Arg Glu Pro305 310 315 Val Val Gly Ala Arg Tyr Ile Gln Thr Leu Lys Asp His Arg Pro320 325 330 Arg Met Val Trp Asp Ser Gln Ala Ser Glu His Phe Phe Glu Tyr335 340 345 Lys Lys Ser Arg Ser Gly Arg His Val Val Phe Tyr Pro Thr Leu350 355 360 Lys Ser Leu Gln Val Arg Leu Glu Leu Ala Arg Glu Leu Gly Val365 370 375 Gly Val Ser Ile Trp Glu Leu Gly Gln Gly Leu Asp Tyr Phe Tyr380 385 390 Asp Leu Leu 320 amino acids amino acid single linearSPLNNOT02 207452 12 Met Val Gly Tyr Asp Pro Lys Pro Asp Gly Arg Asn AsnThr Lys 5 10 15 Phe Gln Val Ala Val Ala Gly Ser Val Ser Gly Leu Val ThrArg 20 25 30 Ala Leu Ile Ser Pro Phe Asp Val Ile Lys Ile Arg Phe Gln Leu35 40 45 Gln His Glu Arg Leu Ser Arg Ser Asp Pro Ser Ala Lys Tyr His 5055 60 Gly Ile Leu Gln Ala Ser Arg Gln Ile Leu Gln Glu Glu Gly Pro 65 7075 Thr Ala Phe Trp Lys Gly His Val Pro Ala Gln Ile Leu Ser Ile 80 85 90Gly Tyr Gly Ala Val Gln Phe Leu Ser Phe Glu Met Leu Thr Glu 95 100 105Leu Val His Arg Gly Ser Val Tyr Asp Ala Arg Glu Phe Ser Val 110 115 120His Phe Val Cys Gly Gly Leu Ala Ala Cys Met Ala Thr Leu Thr 125 130 135Val His Pro Val Asp Val Leu Arg Thr Arg Phe Ala Ala Gln Gly 140 145 150Glu Pro Lys Val Tyr Asn Thr Leu Arg His Ala Val Gly Thr Met 155 160 165Tyr Arg 
Ser Glu Gly Pro Gln Val Phe Tyr Lys Gly Leu Ala Pro 170 175 180Thr Leu Ile Ala Ile Phe Pro Tyr Ala Gly Leu Gln Phe Ser Cys 185 190 195Tyr Ser Ser Leu Lys His Leu Tyr Lys Trp Ala Ile Pro Ala Glu 200 205 210Gly Lys Lys Asn Glu Asn Leu Gln Asn Leu Leu Cys Gly Ser Gly 215 220 225Ala Gly Val Ile Ser Lys Thr Leu Thr Tyr Pro Leu Asp Leu Phe 230 235 240Lys Lys Arg Leu Gln Val Gly Gly Phe Glu His Ala Arg Ala Ala 245 250 255Phe Gly Gln Val Arg Arg Tyr Lys Gly Leu Met Asp Cys Ala Lys 260 265 270Gln Val Leu Gln Lys Glu Gly Ala Leu Gly Phe Phe Lys Gly Leu 275 280 285Ser Pro Ser Leu Leu Lys Ala Ala Leu Ser Thr Gly Phe Met Phe 290 295 300Phe Ser Tyr Glu Phe Phe Cys Asn Val Phe His Cys Met Asn Arg 305 310 315Thr Ala Ser Gln Arg 320 343 amino acids amino acid single linearSPLNNOT02 208836 13 Met Ala Glu Gln Leu Ser Pro Gly Lys Ala Val Asp GlnVal Cys 5 10 15 Thr Phe Leu Phe Lys Lys Pro Gly Arg Lys Gly Ala Ala GlyArg 20 25 30 Arg Lys Arg Pro Ala Cys Asp Pro Glu Pro Gly Glu Ser Gly Ser35 40 45 Ser Ser Asp Glu Gly Cys Thr Val Val Arg Pro Glu Lys Lys Arg 5055 60 Val Thr His Asn Pro Met Met Gln Lys Thr Arg Asp Ser Gly Lys 65 7075 Gln Lys Ala Ala Tyr Gly Asp Leu Ser Ser Glu Glu Glu Glu Glu 80 85 90Asn Glu Pro Glu Ser Leu Gly Val Val Tyr Lys Ser Thr Arg Ser 95 100 105Ala Lys Pro Val Gly Pro Glu Asp Met Gly Ala Thr Ala Val Tyr 110 115 120Glu Leu Asp Thr Glu Lys Glu Arg Asp Ala Gln Ala Ile Phe Glu 125 130 135Arg Ser Gln Lys Ile Gln Glu Glu Leu Arg Gly Lys Glu Asp Asp 140 145 150Lys Ile Tyr Arg Gly Ile Asn Asn Tyr Gln Lys Tyr Met Lys Pro 155 160 165Lys Asp Thr Ser Met Gly Asn Ala Ser Ser Gly Met Val Arg Lys 170 175 180Gly Pro Ile Arg Ala Pro Glu His Leu Arg Ala Thr Val Arg Trp 185 190 195Asp Tyr Gln Pro Asp Ile Cys Lys Asp Tyr Lys Glu Thr Gly Phe 200 205 210Cys Gly Phe Gly Asp Ser Cys Lys Phe Leu His Asp Arg Ser Asp 215 220 225Tyr Lys His Gly Trp Gln Ile Glu Arg Glu Leu Asp Glu Gly Arg 230 235 240Tyr Gly Val Tyr Glu Asp Glu Asn Tyr Glu Val Gly Ser Asp Asp 245 250 255Glu Glu Ile Pro Phe Lys 
Cys Phe Ile Cys Arg Gln Ser Phe Gln 260 265 270Asn Pro Val Val Thr Lys Cys Arg His Tyr Phe Cys Glu Ser Cys 275 280 285Ala Leu Gln His Phe Arg Thr Thr Pro Arg Cys Tyr Val Cys Asp 290 295 300Gln Gln Thr Asn Gly Val Phe Asn Pro Ala Lys Glu Leu Ile Ala 305 310 315Lys Leu Glu Lys His Arg Ala Thr Gly Glu Gly Gly Ala Ser Asp 320 325 330Leu Pro Glu Asp Pro Asp Glu Asp Ala Ile Pro Ile Thr 335 340 368 aminoacids amino acid single linear MMLR3DT01 569710 14 Met Ser Ala Gln SerVal Glu Glu Asp Ser Ile Leu Ile Ile Pro 5 10 15 Thr Pro Asp Glu Glu GluLys Ile Leu Arg Val Lys Leu Glu Glu 20 25 30 Asp Pro Asp Gly Glu Glu GlySer Ser Ile Pro Trp Asn His Leu 35 40 45 Pro Asp Pro Glu Ile Phe Arg GlnArg Phe Arg Gln Phe Gly Tyr 50 55 60 Gln Asp Ser Pro Gly Pro Arg Glu AlaVal Ser Gln Leu Arg Glu 65 70 75 Leu Cys Arg Leu Trp Leu Arg Pro Glu ThrHis Thr Lys Glu Gln 80 85 90 Ile Leu Glu Leu Val Val Leu Glu Gln Phe ValAla Ile Leu Pro 95 100 105 Lys Glu Leu Gln Thr Trp Val Arg Asp His HisPro Glu Asn Gly 110 115 120 Glu Glu Ala Val Thr Val Leu Glu Asp Leu GluSer Glu Leu Asp 125 130 135 Asp Pro Gly Gln Pro Val Ser Leu Arg Arg ArgLys Arg Glu Val 140 145 150 Leu Val Glu Asp Met Val Ser Gln Glu Glu AlaGln Gly Leu Pro 155 160 165 Ser Ser Glu Leu Asp Ala Val Glu Asn Gln LeuLys Trp Ala Ser 170 175 180 Trp Glu Leu His Ser Leu Arg His Cys Asp AspAsp Gly Arg Thr 185 190 195 Glu Asn Gly Ala Leu Ala Pro Lys Gln Glu LeuPro Ser Ala Leu 200 205 210 Glu Ser His Glu Val Pro Gly Thr Leu Ser MetGly Val Pro Gln 215 220 225 Ile Phe Lys Tyr Gly Glu Thr Cys Phe Pro LysGly Arg Phe Glu 230 235 240 Arg Lys Arg Asn Pro Ser Arg Lys Lys Gln HisIle Cys Asp Glu 245 250 255 Cys Gly Lys His Phe Ser Gln Gly Ser Ala LeuIle Leu His Gln 260 265 270 Arg Ile His Ser Gly Glu Lys Pro Tyr Gly CysVal Glu Cys Gly 275 280 285 Lys Ala Phe Ser Arg Ser Ser Ile Leu Val GlnHis Gln Arg Val 290 295 300 His Thr Gly Glu Lys Pro Tyr Lys Cys Leu GluCys Gly Lys Ala 305 310 315 Phe Ser Gln Asn Ser Gly Leu Ile Asn His GlnArg Ile His Thr 320 325 330 Gly 
Glu Lys Pro Tyr Glu Cys Val Gln Cys GlyLys Ser Tyr Ser 335 340 345 Gln Ser Ser Asn Leu Phe Arg His Gln Arg ArgHis Asn Ala Glu 350 355 360 Lys Leu Leu Asn Val Val Lys Val 365 158amino acids amino acid single linear BRSTTUT01 606742 15 Met Glu Gly ProArg Arg Gly Pro Glu Val Gly Gly Phe Cys Lys 5 10 15 Tyr Arg Leu Leu ArgVal Ser Arg Ala Leu Cys His Asp Thr Ser 20 25 30 Leu Gly Leu Thr Trp LeuArg Thr Cys Ser Val Arg Gly Phe Val 35 40 45 Arg Thr Leu Pro Phe Cys LeuLys Leu Lys Ala Lys Glu Asn Asp 50 55 60 Arg Arg Leu Arg Thr Glu Leu ThrLeu Ala Pro Gly Trp Glu Ala 65 70 75 Ala Ala Leu Leu Asp Ala Thr Tyr CysLys Trp Pro Glu Tyr Gln 80 85 90 Arg Gly Gly Phe His Gly Gln Met His SerArg Cys Leu Pro Leu 95 100 105 His Leu Asp His Leu Val Val Phe Lys PheLeu Val Pro Glu Ala 110 115 120 Lys Ser Thr Thr Cys Leu Leu Val Thr CysLeu Pro Ala Val Val 125 130 135 Val Asp Val Leu Ala Gly Arg Phe Gly IleSer His Gln Ser Phe 140 145 150 Cys Thr Val Leu Val Ser Ser Ile 155 334amino acids amino acid single linear COLNNOT01 611135 16 Met Ala Thr ArgGln Arg Glu Ser Ser Ile Thr Ser Cys Cys Ser 5 10 15 Thr Ser Ser Cys AspAla Asp Asp Glu Gly Val Arg Gly Thr Cys 20 25 30 Glu Asp Ala Ser Leu CysLys Arg Phe Ala Val Ser Ile Gly Tyr 35 40 45 Trp His Asp Pro Tyr Ile GlnHis Phe Val Arg Leu Ser Lys Glu 50 55 60 Arg Lys Ala Pro Glu Ile Asn ArgGly Tyr Phe Ala Arg Val His 65 70 75 Gly Val Ser Gln Leu Ile Lys Ala PheLeu Arg Lys Thr Glu Cys 80 85 90 His Cys Gln Ile Val Asn Leu Gly Ala GlyMet Asp Thr Thr Phe 95 100 105 Trp Arg Leu Lys Asp Glu Asp Leu Leu ProSer Lys Tyr Phe Glu 110 115 120 Val Asp Phe Pro Met Ile Val Thr Arg LysLeu His Ser Ile Lys 125 130 135 Cys Lys Pro Pro Leu Ser Ser Pro Ile LeuGlu Leu His Ser Glu 140 145 150 Asp Thr Leu Gln Met Asp Gly His Ile LeuAsp Ser Lys Arg Tyr 155 160 165 Ala Val Ile Gly Ala Asp Leu Arg Asp LeuSer Glu Leu Glu Glu 170 175 180 Lys Leu Lys Lys Cys Asn Met Asn Thr GlnLeu Pro Thr Leu Leu 185 190 195 Ile Ala Glu Cys Val Leu Val Tyr Met ThrPro Glu Gln Ser Ala 200 205 210 Asn 
Leu Leu Lys Trp Ala Ala Asn Ser PheGlu Arg Ala Met Phe 215 220 225 Ile Asn Tyr Glu Gln Val Asn Met Gly AspArg Phe Gly Gln Ile 230 235 240 Met Ile Glu Asn Leu Arg Arg Arg Gln CysAsp Leu Ala Gly Val 245 250 255 Glu Thr Cys Lys Ser Leu Glu Ser Gln LysGlu Arg Leu Leu Ser 260 265 270 Asn Gly Trp Glu Thr Ala Ser Ala Val AspMet Met Glu Leu Tyr 275 280 285 Asn Arg Leu Pro Arg Ala Glu Val Ser ArgIle Glu Ser Leu Glu 290 295 300 Phe Leu Asp Glu Met Glu Leu Leu Glu GlnLeu Met Arg His Tyr 305 310 315 Cys Leu Cys Trp Ala Thr Lys Gly Gly AsnGlu Leu Gly Leu Lys 320 325 330 Glu Ile Thr Tyr 488 amino acids aminoacid single linear BRSTNOT03 641127 17 Met Ala Ser Thr Ile Thr Gly SerGln Asp Cys Ile Val Asn His 5 10 15 Arg Gly Glu Val Asp Gly Glu Pro GluLeu Asp Ile Ser Pro Cys 20 25 30 Gln Gln Trp Gly Glu Ala Ser Ser Pro IleSer Arg Asn Arg Asp 35 40 45 Ser Val Met Thr Leu Gln Ser Gly Cys Phe GluAsn Ile Glu Ser 50 55 60 Glu Thr Tyr Leu Pro Leu Lys Val Ser Ser Gln IleAsp Thr Gln 65 70 75 Asp Ser Ser Val Lys Phe Cys Lys Asn Glu Pro Gln AspHis Gln 80 85 90 Glu Ser Arg Arg Leu Phe Val Met Glu Glu Ser Thr Glu ArgLys 95 100 105 Val Ile Lys Gly Glu Ser Cys Ser Glu Asn Leu Gln Val LysLeu 110 115 120 Val Ser Asp Gly Gln Glu Leu Ala Ser Pro Leu Leu Asn GlyGlu 125 130 135 Ala Thr Cys Gln Asn Gly Gln Leu Lys Glu Ser Leu Asp ProIle 140 145 150 Asp Cys Asn Cys Lys Asp Ile His Gly Trp Lys Ser Gln ValVal 155 160 165 Ser Cys Ser Gln Gln Arg Gly His Thr Glu Glu Lys Pro CysAsp 170 175 180 His Asn Asn Cys Gly Lys Ile Leu Asn Thr Ser Pro Asp GlyHis 185 190 195 Pro Tyr Glu Lys Ile His Thr Ala Glu Lys Gln Tyr Glu GlySer 200 205 210 Gln Cys Gly Lys Asn Phe Ser Gln Ser Ser Glu Leu Leu LeuHis 215 220 225 Gln Arg Asp His Thr Glu Glu Lys Pro Tyr Lys Cys Glu GlnCys 230 235 240 Gly Lys Gly Phe Thr Arg Ser Ser Ser Leu Leu Ile His GlnAla 245 250 255 Val His Thr Asp Glu Lys Pro Tyr Lys Cys Asp Lys Cys GlyLys 260 265 270 Gly Phe Thr Arg Ser Ser Ser Leu Leu Ile His His Ala ValHis 275 280 285 Thr Gly Glu Lys Pro Tyr Lys 
Cys Asp Lys Cys Gly Lys GlyPhe 290 295 300 Ser Gln Ser Ser Lys Leu His Ile His Gln Arg Val His ThrGly 305 310 315 Glu Lys Pro Tyr Glu Cys Glu Glu Cys Gly Met Ser Phe SerGln 320 325 330 Arg Ser Asn Leu His Ile His Gln Arg Val His Thr Gly GluArg 335 340 345 Pro Tyr Lys Cys Gly Glu Cys Gly Lys Gly Phe Ser Gln SerSer 350 355 360 Asn Leu His Ile His Arg Cys Ile His Thr Gly Glu Lys ProTyr 365 370 375 Gln Cys Tyr Glu Cys Gly Lys Gly Phe Ser Gln Ser Ser AspLeu 380 385 390 Arg Ile His Leu Arg Val His Thr Gly Glu Lys Pro Tyr HisCys 395 400 405 Gly Lys Cys Gly Lys Gly Phe Ser Gln Ser Ser Lys Leu LeuIle 410 415 420 His Gln Arg Val His Thr Gly Glu Lys Pro Tyr Glu Cys SerLys 425 430 435 Cys Gly Lys Gly Phe Ser Gln Ser Ser Asn Leu His Ile HisGln 440 445 450 Arg Val His Lys Arg Asp Pro Arg Ala His Pro Gly Leu HisSer 455 460 465 Ala His Thr Val Asn Thr Val Lys Tyr Leu Val Ser Leu LeuLeu 470 475 480 Tyr Ile Leu Gln Arg Arg Glu Met 485 255 amino acidsamino acid single linear LUNGTUT02 691768 18 Met Gly Arg Asn Lys Lys LysLys Arg Asp Gly Asp Asp Arg Arg 5 10 15 Pro Arg Leu Val Leu Ser Phe AspGlu Glu Lys Arg Arg Glu Tyr 20 25 30 Leu Thr Gly Phe His Lys Arg Lys ValGlu Arg Lys Lys Ala Ala 35 40 45 Ile Glu Glu Ile Lys Gln Arg Leu Lys GluGlu Gln Arg Lys Leu 50 55 60 Arg Glu Glu Arg His Gln Glu Tyr Leu Lys MetLeu Ala Glu Arg 65 70 75 Glu Glu Ala Leu Glu Glu Ala Asp Glu Leu Asp ArgLeu Val Thr 80 85 90 Ala Lys Thr Glu Ser Val Gln Tyr Asp His Pro Asn HisThr Val 95 100 105 Thr Val Thr Thr Ile Ser Asp Leu Asp Leu Ser Gly AlaArg Leu 110 115 120 Leu Gly Leu Thr Pro Pro Glu Gly Gly Ala Gly Asp ArgSer Glu 125 130 135 Glu Glu Ala Ser Ser Thr Glu Lys Pro Thr Lys Ala LeuPro Arg 140 145 150 Lys Ser Arg Asp Pro Leu Leu Ser Gln Arg Ile Ser SerLeu Thr 155 160 165 Ala Ser Leu His Ala His Ser Arg Lys Lys Val Lys ArgLys His 170 175 180 Ser Arg Arg Ala Gln Asp Ser Lys Lys Pro Pro Lys GlyPro Ser 185 190 195 Tyr Gln Gln Arg Pro Ser Gly Ala Val Phe Thr Gly LysAla Pro 200 205 210 Ala Gln Arg Gly Asn Xaa Arg Xaa 
Glu Asn Glu Ala Gly Cys Pro 215 220 225 His Ser Lys Ala Xaa Arg Gly Xaa Cys Ser Leu Gly Ser Ala Leu 230 235 240 Ala Val Pro Leu Leu Xaa Pro Ala Leu Xaa Leu Lys Val Leu Pro 245 250 255
351 amino acids amino acid single linear SYNOOAT01 724157 19
Met Ala Asp Gln Asp Pro Ala Gly Ile Ser Pro Leu Gln Gln Met 5 10 15 Val Ala Ser Gly Thr Gly Ala Val Val Thr Ser Leu Phe Met Thr 20 25 30 Pro Leu Asp Val Val Lys Val Arg Leu Gln Ser Gln Arg Pro Ser 35 40 45 Met Ala Ser Glu Leu Met Pro Ser Ser Arg Leu Trp Ser Leu Ser 50 55 60 Tyr Thr Lys Trp Lys Cys Leu Leu Tyr Cys Asn Gly Val Leu Glu 65 70 75 Pro Leu Tyr Leu Cys Pro Asn Gly Ala Arg Cys Ala Thr Trp Phe 80 85 90 Gln Asp Pro Thr Arg Phe Thr Gly Thr Met Asp Ala Phe Val Lys 95 100 105 Ile Val Arg His Glu Gly Thr Arg Thr Leu Trp Ser Gly Leu Pro 110 115 120 Ala Thr Leu Val Met Thr Val Pro Ala Thr Ala Ile Tyr Phe Thr 125 130 135 Ala Tyr Asp Gln Leu Lys Ala Phe Leu Cys Gly Arg Ala Leu Thr 140 145 150 Ser Asp Leu Tyr Ala Pro Met Val Ala Gly Ala Leu Ala Arg Leu 155 160 165 Gly Thr Val Thr Val Ile Ser Pro Leu Glu Leu Met Arg Thr Lys 170 175 180 Leu Gln Ala Gln His Val Ser Tyr Arg Glu Leu Gly Ala Cys Val 185 190 195 Arg Thr Ala Val Ala Gln Gly Gly Trp Arg Ser Leu Trp Leu Gly 200 205 210 Trp Gly Pro Thr Ala Leu Arg Asp Val Pro Phe Ser Ala Leu Tyr 215 220 225 Trp Phe Asn Tyr Glu Leu Val Lys Ser Trp Leu Asn Gly Leu Arg 230 235 240 Pro Lys Asp Gln Thr Ser Val Gly Met Ser Phe Val Ala Gly Gly 245 250 255 Ile Ser Gly Thr Val Ala Ala Val Leu Thr Leu Pro Phe Asp Val 260 265 270 Val Lys Thr Gln Arg Gln Val Ala Leu Gly Ala Met Glu Ala Val 275 280 285 Arg Val Asn Pro Leu His Val Asp Ser Thr Trp Leu Leu Leu Arg 290 295 300 Arg Ile Arg Ala Glu Ser Gly Thr Lys Gly Leu Phe Ala Gly Phe 305 310 315 Leu Pro Arg Ile Ile Lys Ala Ala Pro Ser Cys Ala Ile Met Ile 320 325 330 Ser Thr Tyr Glu Phe Gly Lys Ser Phe Phe Gln Arg Leu Asn Gln 335 340 345 Asp Arg Leu Leu Gly Gly 350
535 amino acids amino acid single linear BRAITUT03 864683 20
Met Ser Glu Gly Glu Ser Gln Thr Val Leu Ser Ser Gly Ser
Asp 5 10 15 Pro Lys Val Glu Ser Ser Ser Ser Ala Pro Gly Leu Thr Ser Val 20 25 30 Ser Pro Pro Val Thr Ser Thr Thr Ser Ala Ala Ser Pro Glu Glu 35 40 45 Glu Glu Glu Ser Glu Asp Glu Ser Glu Ile Leu Glu Glu Ser Pro 50 55 60 Cys Gly Arg Trp Gln Lys Arg Arg Glu Glu Val Asn Gln Arg Asn 65 70 75 Val Pro Gly Ile Asp Ser Ala Tyr Leu Ala Met Asp Thr Glu Glu 80 85 90 Gly Val Glu Val Val Trp Asn Glu Val Gln Phe Ser Glu Arg Lys 95 100 105 Asn Tyr Lys Leu Gln Glu Glu Lys Val Arg Ala Val Phe Asp Asn 110 115 120 Leu Ile Gln Leu Glu His Leu Asn Ile Val Lys Phe His Lys Tyr 125 130 135 Trp Ala Asp Ile Lys Glu Asn Lys Ala Arg Val Ile Phe Ile Thr 140 145 150 Glu Tyr Met Ser Ser Gly Ser Leu Lys Gln Phe Leu Lys Lys Thr 155 160 165 Lys Lys Asn His Lys Thr Met Asn Glu Lys Ala Trp Lys Arg Trp 170 175 180 Cys Thr Gln Ile Leu Ser Ala Leu Ser Tyr Leu His Ser Cys Asp 185 190 195 Pro Pro Ile Ile His Gly Asn Leu Thr Cys Asp Thr Ile Phe Ile 200 205 210 Gln His Asn Gly Leu Ile Lys Ile Gly Ser Val Ala Pro Asp Thr 215 220 225 Ile Asn Asn His Val Lys Thr Cys Arg Glu Glu Gln Lys Asn Leu 230 235 240 His Phe Phe Ala Pro Glu Tyr Gly Glu Val Thr Asn Val Thr Thr 245 250 255 Ala Val Asp Ile Tyr Ser Phe Gly Met Cys Ala Leu Glu Met Ala 260 265 270 Val Leu Glu Ile Gln Gly Asn Gly Glu Ser Ser Tyr Val Pro Gln 275 280 285 Glu Ala Ile Ser Ser Ala Ile Gln Leu Leu Glu Asp Pro Leu Gln 290 295 300 Arg Glu Phe Ile Gln Lys Cys Leu Gln Ser Glu Pro Ala Arg Arg 305 310 315 Pro Thr Ala Arg Glu Leu Leu Phe His Pro Ala Leu Phe Glu Val 320 325 330 Pro Ser Leu Lys Leu Leu Ala Ala His Cys Ile Val Gly His Gln 335 340 345 His Met Ile Pro Glu Asn Ala Leu Glu Glu Ile Thr Lys Asn Met 350 355 360 Asp Thr Ser Ala Val Leu Ala Glu Ile Pro Ala Gly Pro Gly Arg 365 370 375 Glu Pro Val Gln Thr Leu Tyr Ser Gln Ser Pro Ala Leu Glu Leu 380 385 390 Asp Lys Phe Leu Glu Asp Val Arg Asn Gly Ile Tyr Pro Leu Thr 395 400 405 Ala Phe Gly Leu Pro Arg Pro Gln Gln Pro Gln Gln Glu Glu Val 410 415 420 Thr Ser Pro Val Val Pro Pro Ser Val Lys Thr Pro Thr Pro Glu 425 430 435 Pro Ala Glu Val
Glu Thr Arg Lys Val Val Leu Met Gln Cys Asn 440 445 450 Ile Glu Ser Val Glu Glu Gly Val Lys His His Leu Thr Leu Leu 455 460 465 Leu Lys Leu Glu Asp Lys Leu Asn Arg His Leu Ser Cys Asp Leu 470 475 480 Met Pro Asn Glu Asn Ile Pro Glu Leu Ala Ala Glu Leu Val Gln 485 490 495 Leu Gly Phe Ile Ser Glu Ala Asp Gln Ser Arg Leu Thr Ser Leu 500 505 510 Leu Glu Glu Thr Leu Asn Lys Phe Asn Phe Ala Arg Asn Ser Thr 515 520 525 Leu Asn Ser Ala Ala Val Thr Val Ser Ser 530 535
201 amino acids amino acid single linear CERVNOT01 933353 21
Met Ala Ala Thr Ala Leu Leu Glu Ala Gly Leu Ala Arg Val Leu 5 10 15 Phe Tyr Pro Thr Leu Leu Tyr Thr Leu Phe Arg Gly Lys Val Pro 20 25 30 Gly Arg Ala His Arg Asp Trp Tyr His Arg Ile Asp Pro Thr Val 35 40 45 Leu Leu Gly Ala Leu Pro Leu Arg Ser Leu Thr Arg Gln Leu Val 50 55 60 Gln Asp Glu Asn Val Arg Gly Val Ile Thr Met Asn Glu Glu Tyr 65 70 75 Glu Thr Arg Phe Leu Cys Asn Ser Ser Gln Glu Trp Lys Arg Leu 80 85 90 Gly Val Glu Gln Leu Arg Leu Ser Thr Val Asp Met Thr Gly Ile 95 100 105 Pro Thr Leu Asp Asn Leu Gln Lys Gly Val Gln Phe Ala Leu Lys 110 115 120 Tyr Gln Ser Leu Gly Gln Cys Val Tyr Val His Cys Lys Ala Gly 125 130 135 Arg Ser Arg Ser Ala Thr Met Val Ala Ala Tyr Leu Ile Gln Val 140 145 150 His Lys Trp Ser Pro Glu Glu Ala Val Arg Ala Ile Ala Lys Ile 155 160 165 Arg Ser Tyr Ile His Ile Arg Pro Gly Gln Leu Asp Val Leu Lys 170 175 180 Glu Phe His Lys Gln Ile Thr Ala Arg Ala Thr Lys Asp Gly Thr 185 190 195 Phe Val Ile Ser Lys Thr 200
239 amino acids amino acid single linear LATRTUT02 1404643 22
Met Ala Tyr Gln Ser Leu Arg Leu Glu Tyr Leu Gln Ile Pro Pro 5 10 15 Val Ser Arg Ala Tyr Thr Thr Ala Cys Val Leu Thr Thr Ala Ala 20 25 30 Val Gln Leu Glu Leu Ile Thr Pro Phe Gln Leu Tyr Phe Asn Pro 35 40 45 Glu Leu Ile Phe Lys His Phe Gln Ile Trp Arg Leu Ile Thr Asn 50 55 60 Phe Leu Phe Phe Gly Pro Val Gly Phe Asn Phe Leu Phe Asn Met 65 70 75 Ile Phe Leu Tyr Arg Tyr Cys Arg Met Leu Glu Glu Gly Ser Phe 80 85 90 Arg Gly Arg Thr Ala Asp Phe Val Phe Met Phe Leu Phe Gly Gly 95 100 105 Phe Leu Met
Thr Leu Phe Gly Leu Phe Val Ser Leu Val Phe Leu 110 115 120 Gly Gln Ala Phe Thr Ile Met Leu Val Tyr Val Trp Ser Arg Arg 125 130 135 Asn Pro Tyr Val Arg Met Asn Phe Phe Gly Leu Leu Asn Phe Gln 140 145 150 Ala Pro Phe Leu Pro Trp Val Leu Met Gly Phe Ser Leu Leu Leu 155 160 165 Gly Asn Ser Ile Ile Val Asp Leu Leu Gly Ile Ala Val Gly His 170 175 180 Ile Tyr Phe Phe Leu Glu Asp Val Phe Pro Asn Gln Pro Gly Gly 185 190 195 Ile Arg Ile Leu Lys Thr Pro Ser Ile Leu Lys Ala Ile Phe Asp 200 205 210 Thr Pro Asp Glu Asp Pro Asn Tyr Asn Pro Leu Pro Glu Glu Arg 215 220 225 Pro Gly Gly Phe Ala Trp Gly Glu Gly Gln Arg Leu Gly Gly 230 235
244 amino acids amino acid single linear SPLNNOT04 1561587 23
Met Met Arg Thr Gln Cys Leu Leu Gly Leu Arg Thr Phe Val Ala 5 10 15 Phe Ala Ala Lys Leu Trp Ser Phe Phe Ile Tyr Leu Leu Arg Arg 20 25 30 Gln Ile Arg Thr Val Ile Gln Tyr Gln Thr Val Arg Tyr Asp Ile 35 40 45 Leu Pro Leu Ser Pro Val Ser Arg Asn Arg Leu Ala Gln Val Lys 50 55 60 Arg Lys Ile Leu Val Leu Asp Leu Asp Glu Thr Leu Ile His Ser 65 70 75 His His Asp Gly Val Leu Arg Pro Thr Val Arg Pro Gly Thr Pro 80 85 90 Pro Asp Phe Ile Leu Lys Val Val Ile Asp Lys His Pro Val Arg 95 100 105 Phe Phe Val His Lys Arg Pro His Val Asp Phe Phe Leu Glu Val 110 115 120 Val Ser Gln Trp Tyr Glu Leu Val Val Phe Thr Ala Ser Met Glu 125 130 135 Ile Tyr Gly Ser Ala Val Ala Asp Lys Leu Asp Asn Ser Arg Ser 140 145 150 Ile Leu Lys Arg Arg Tyr Tyr Arg Gln His Cys Thr Leu Glu Leu 155 160 165 Gly Ser Tyr Ile Lys Asp Leu Ser Val Val His Ser Asp Leu Ser 170 175 180 Ser Ile Val Ile Leu Asp Asn Ser Pro Gly Ala Tyr Arg Ser His 185 190 195 Pro Asp Asn Ala Ile Pro Ile Lys Ser Trp Phe Ser Asp Pro Ser 200 205 210 Asp Thr Ala Leu Leu Asn Leu Leu Pro Met Leu Asp Ala Leu Arg 215 220 225 Phe Thr Ala Asp Val Arg Ser Val Leu Ser Arg Asn Leu His Gln 230 235 240 His Arg Leu Trp
431 amino acids amino acid single linear UTRSNOT05 1568361 24
Met Ser Ser Val Glu Glu Asp Asp Tyr Asp Thr Leu Thr Asp Ile 5 10 15 Asp Ser Asp Lys Asn Val Ile Arg Thr Lys Gln Tyr Leu Tyr Val
20 25 30 Ala Asp Leu Ala Arg Lys Asp Lys Arg Val Leu Arg Lys Lys Tyr 35 40 45 Gln Ile Tyr Phe Trp Asn Ile Ala Thr Ile Ala Val Phe Tyr Ala 50 55 60 Leu Pro Val Val Gln Leu Val Ile Thr Tyr Gln Thr Val Val Asn 65 70 75 Val Thr Gly Asn Gln Asp Ile Cys Tyr Tyr Asn Phe Leu Cys Ala 80 85 90 His Pro Leu Gly Asn Leu Ser Ala Phe Asn Asn Ile Leu Ser Asn 95 100 105 Leu Gly Tyr Ile Leu Leu Gly Leu Leu Phe Leu Leu Ile Ile Leu 110 115 120 Gln Arg Glu Ile Asn His Asn Arg Ala Leu Leu Arg Asn Asp Leu 125 130 135 Cys Ala Leu Glu Cys Gly Ile Pro Lys His Phe Gly Leu Phe Tyr 140 145 150 Ala Met Gly Thr Ala Leu Met Met Glu Gly Leu Leu Ser Ala Cys 155 160 165 Tyr His Val Cys Pro Asn Tyr Thr Asn Phe Gln Phe Asp Thr Ser 170 175 180 Phe Met Tyr Met Ile Ala Gly Leu Cys Met Leu Lys Leu Tyr Gln 185 190 195 Lys Arg His Pro Asp Ile Asn Ala Ser Ala Tyr Ser Ala Tyr Ala 200 205 210 Cys Leu Ala Ile Val Ile Phe Xaa Ser Val Leu Gly Val Val Phe 215 220 225 Gly Lys Gly Asn Thr Ala Phe Trp Ile Val Phe Ser Ile Ile His 230 235 240 Ile Ile Ala Thr Leu Leu Leu Ser Thr Gln Leu Tyr Tyr Met Gly 245 250 255 Arg Trp Lys Leu Asp Ser Gly Ile Phe Arg Arg Ile Leu His Val 260 265 270 Leu Tyr Thr Asp Cys Ile Arg Gln Cys Ser Gly Pro Leu Tyr Val 275 280 285 Asp Arg Met Val Leu Leu Val Met Gly Asn Val Ile Asn Trp Ser 290 295 300 Leu Ala Ala Tyr Gly Leu Ile Met Arg Pro Asn Asp Phe Ala Ser 305 310 315 Tyr Leu Leu Ala Ile Gly Ile Cys Asn Leu Leu Leu Tyr Phe Ala 320 325 330 Phe Tyr Ile Ile Met Lys Leu Arg Ser Gly Glu Arg Ile Lys Leu 335 340 345 Ile Pro Leu Leu Cys Ile Val Cys Thr Ser Val Val Trp Gly Phe 350 355 360 Ala Leu Phe Phe Phe Phe Gln Gly Leu Ser Thr Trp Gln Lys Thr 365 370 375 Pro Ala Glu Ser Arg Glu His Asn Arg Asp Cys Ile Leu Leu Asp 380 385 390 Phe Phe Asp Asp His Asp Ile Trp His Phe Leu Ser Ser Ile Ala 395 400 405 Met Phe Gly Ser Phe Leu Val Leu Leu Thr Leu Asp Asp Asp Leu 410 415 420 Asp Thr Val Gln Arg Asp Lys Ile Tyr Val Phe 425 430
376 amino acids amino acid single linear LNODNOT03 1572888 25
Met Gly His Arg Phe Leu Arg Gly Leu Leu Thr
Leu Leu Leu Pro 5 10 15 Pro Pro Pro Leu Tyr Thr Arg His Arg Met Leu Gly Pro Glu Ser 20 25 30 Val Pro Pro Pro Lys Arg Ser Arg Ser Lys Leu Met Ala Pro Pro 35 40 45 Arg Ile Gly Thr His Asn Gly Thr Phe His Cys Asp Glu Ala Leu 50 55 60 Ala Cys Ala Leu Leu Arg Leu Leu Pro Glu Tyr Arg Asp Ala Glu 65 70 75 Ile Val Arg Thr Arg Asp Pro Glu Lys Leu Ala Ser Cys Asp Ile 80 85 90 Val Val Asp Val Gly Gly Glu Tyr Asp Pro Arg Arg His Arg Tyr 95 100 105 Asp His His Gln Arg Ser Phe Thr Glu Thr Met Ser Ser Leu Ser 110 115 120 Pro Gly Lys Pro Trp Gln Thr Lys Leu Ser Ser Ala Gly Leu Ile 125 130 135 Tyr Leu His Phe Gly His Lys Leu Leu Ala Gln Leu Leu Gly Thr 140 145 150 Ser Glu Glu Asp Ser Met Val Gly Thr Leu Tyr Asp Lys Met Tyr 155 160 165 Glu Asn Phe Val Glu Glu Val Asp Ala Val Asp Asn Gly Ile Ser 170 175 180 Gln Trp Ala Glu Gly Glu Pro Arg Tyr Ala Leu Thr Thr Thr Leu 185 190 195 Ser Ala Arg Val Ala Arg Leu Asn Pro Thr Trp Asn His Pro Asp 200 205 210 Gln Asp Thr Glu Ala Gly Phe Lys Arg Ala Met Asp Leu Val Gln 215 220 225 Glu Glu Phe Leu Gln Arg Leu Asp Phe Tyr Gln His Ser Trp Leu 230 235 240 Pro Ala Arg Ala Leu Val Glu Glu Ala Leu Ala Gln Arg Phe Gln 245 250 255 Val Asp Pro Ser Gly Glu Ile Val Glu Leu Ala Lys Gly Ala Cys 260 265 270 Pro Trp Lys Glu His Leu Tyr His Leu Glu Ser Gly Leu Ser Pro 275 280 285 Pro Val Ala Ile Phe Phe Val Ile Tyr Thr Asp Gln Ala Gly Gln 290 295 300 Trp Arg Ile Gln Cys Val Pro Lys Glu Pro His Ser Phe Gln Ser 305 310 315 Arg Leu Pro Leu Pro Glu Pro Trp Arg Gly Leu Arg Asp Glu Ala 320 325 330 Leu Asp Gln Val Ser Gly Ile Pro Gly Cys Ile Phe Val His Ala 335 340 345 Ser Gly Phe Ile Gly Gly His Arg Thr Arg Glu Gly Ala Leu Ser 350 355 360 Met Ala Arg Ala Thr Leu Ala Gln Arg Ser Tyr Leu Pro Gln Ile 365 370 375 Ser
340 amino acids amino acid single linear LNODNOT03 1573677 26
Met Arg Leu Arg Gly Leu Leu Gln Gly Thr Leu Arg Phe His Thr 5 10 15 Ser Pro Pro Thr Asp Ser Ser Val Thr Glu Thr Ile Ile Leu Cys 20 25 30 Thr Met Leu Phe Leu Gly Ser Leu Gly Ala Trp Gly Thr Thr Ser 35 40 45 Ile Ser Thr Gly Ser
Ile Phe Ser Leu Lys Thr Leu Arg Ser Gln 50 55 60 His Gly Gly Gln Val Gly Leu Lys Val Ser Arg Pro Arg Ala Gln 65 70 75 Pro Leu Pro Ala Gln Pro Pro Ala Leu Ala Gln Pro Gln Tyr Gln 80 85 90 Ser Pro Gln Gln Pro Pro Gln Thr Arg Trp Val Ala Pro Arg Asn 95 100 105 Arg Asn Ala Ala Phe Gly Gln Ser Gly Gly Ala Gly Ser Asp Ser 110 115 120 Asn Ser Pro Gly Asn Val Gln Pro Asn Ser Ala Pro Ser Val Glu 125 130 135 Ser His Pro Val Leu Glu Lys Leu Lys Ala Ala His Ser Tyr Asn 140 145 150 Pro Lys Glu Phe Glu Trp Asn Leu Lys Ser Gly Arg Val Phe Ile 155 160 165 Ile Lys Ser Tyr Ser Glu Asp Asp Ile His Arg Ser Ile Lys Tyr 170 175 180 Ser Ile Trp Cys Ser Thr Glu His Gly Asn Lys Arg Leu Asp Ser 185 190 195 Ala Phe Arg Cys Met Ser Ser Lys Gly Pro Val Tyr Leu Leu Phe 200 205 210 Ser Val Asn Gly Ser Gly His Phe Cys Gly Val Ala Glu Met Lys 215 220 225 Ser Pro Val Asp Tyr Gly Thr Ser Ala Gly Val Trp Ser Gln Asp 230 235 240 Lys Trp Lys Gly Lys Phe Asp Val Gln Trp Ile Phe Val Lys Asp 245 250 255 Val Pro Asn Asn Gln Leu Arg His Ile Arg Leu Glu Asn Asn Asp 260 265 270 Asn Lys Pro Val Thr Asn Ser Arg Asp Thr Gln Glu Val Pro Leu 275 280 285 Glu Lys Ala Lys Gln Val Leu Lys Ile Ile Ser Ser Tyr Lys His 290 295 300 Thr Thr Ser Ile Phe Asp Asp Phe Ala His Tyr Glu Lys Arg Gln 305 310 315 Arg Arg Arg Arg Trp Cys Ala Arg Asn Gly Arg Val Glu Thr Asn 320 325 330 Asn Glu Gly Glu Pro Val Ser Tyr Met Phe 335 340
174 amino acids amino acid single linear LNODNOT03 1574624 27
Met Ala Asp Val Leu Asp Leu His Glu Ala Gly Gly Glu Asp Phe 5 10 15 Ala Met Asp Glu Asp Gly Asp Glu Ser Ile His Lys Leu Lys Glu 20 25 30 Lys Ala Lys Lys Arg Lys Gly Arg Gly Phe Gly Ser Glu Glu Gly 35 40 45 Ser Arg Ala Arg Met Arg Glu Asp Tyr Asp Ser Val Glu Gln Asp 50 55 60 Gly Asp Glu Pro Gly Pro Gln Arg Ser Val Glu Gly Trp Ile Leu 65 70 75 Phe Val Thr Gly Val His Glu Glu Ala Thr Glu Glu Asp Ile His 80 85 90 Asp Lys Phe Ala Glu Tyr Gly Glu Ile Lys Asn Ile His Leu Asn 95 100 105 Leu Asp Arg Arg Thr Gly Tyr Leu Lys Gly Tyr Thr Leu Val Glu 110 115 120 Tyr Glu Thr Tyr Lys Glu
Ala Gln Ala Ala Met Glu Gly Leu Asn 125 130 135 Gly Gln Asp Leu Met Gly Gln Pro Ile Ser Val Asp Trp Cys Phe 140 145 150 Val Arg Gly Pro Pro Lys Gly Lys Arg Arg Gly Gly Arg Arg Arg 155 160 165 Ser Arg Ser Pro Asp Arg Arg Arg Arg 170
179 amino acids amino acid single linear LNODNOT03 1577239 28
Met Val Gln Ala Trp Tyr Met Asp Asp Ala Pro Gly Asp Pro Arg 5 10 15 Gln Pro His Arg Pro Asp Pro Gly Arg Pro Val Gly Leu Glu Gln 20 25 30 Leu Arg Arg Leu Gly Val Leu Tyr Trp Lys Leu Asp Ala Asp Lys 35 40 45 Tyr Glu Asn Asp Pro Glu Leu Glu Lys Ile Arg Arg Glu Arg Asn 50 55 60 Tyr Ser Trp Met Asp Ile Ile Thr Ile Cys Lys Asp Lys Leu Pro 65 70 75 Asn Tyr Glu Glu Lys Ile Lys Met Phe Tyr Glu Glu His Leu His 80 85 90 Leu Asp Asp Glu Ile Arg Tyr Ile Leu Asp Gly Ser Gly Tyr Phe 95 100 105 Asp Val Arg Asp Lys Glu Asp Gln Trp Ile Arg Ile Phe Met Glu 110 115 120 Lys Gly Asp Met Val Thr Leu Pro Ala Gly Ile Tyr His Arg Phe 125 130 135 Thr Val Asp Glu Lys Asn Tyr Thr Lys Ala Met Arg Leu Phe Val 140 145 150 Gly Glu Pro Val Trp Thr Ala Tyr Asn Arg Pro Ala Asp His Phe 155 160 165 Glu Ala Arg Gly Gln Tyr Val Lys Phe Leu Ala Gln Thr Ala 170 175
205 amino acids amino acid single linear BLADNOT03 1598203 29
Met Ala Ala Ala Arg Pro Ser Leu Gly Arg Val Leu Pro Gly Ser 5 10 15 Ser Val Leu Phe Leu Cys Asp Met Gln Glu Lys Phe Arg His Asn 20 25 30 Ile Ala Tyr Phe Pro Gln Ile Val Ser Val Ala Ala Arg Met Leu 35 40 45 Lys Val Ala Arg Leu Leu Glu Val Pro Val Met Leu Thr Glu Gln 50 55 60 Tyr Pro Gln Gly Leu Gly Pro Thr Val Pro Glu Leu Gly Thr Glu 65 70 75 Gly Leu Arg Pro Leu Ala Lys Thr Cys Phe Ser Met Val Pro Ala 80 85 90 Leu Gln Gln Glu Leu Asp Ser Arg Pro Gln Leu Arg Ser Val Leu 95 100 105 Leu Cys Gly Ile Glu Ala Gln Ala Cys Ile Leu Asn Thr Thr Leu 110 115 120 Asp Leu Leu Asp Arg Gly Leu Gln Val His Val Val Val Asp Ala 125 130 135 Cys Ser Ser Arg Ser Gln Val Asp Arg Leu Val Ala Leu Ala Arg 140 145 150 Met Arg Gln Ser Gly Ala Phe Leu Ser Thr Ser Glu Gly Leu Ile 155 160 165 Leu Gln Leu Val Gly Asp Ala Val His Pro Gln Phe Lys Glu Ile 170
175 180 Gln Lys Leu Ile Lys Glu Pro Ala Pro Asp Ser Gly Leu Leu Gly 185 190 195 Leu Phe Gln Gly Gln Asn Ser Leu Leu His 200 205
419 amino acids amino acid single linear BLADNOT03 1600438 30
Met Asn Lys His Gln Lys Pro Val Leu Thr Gly Gln Arg Phe Lys 5 10 15 Thr Arg Lys Arg Asp Glu Lys Glu Lys Phe Glu Pro Thr Val Phe 20 25 30 Arg Asp Thr Leu Val Gln Gly Leu Asn Glu Ala Gly Asp Asp Leu 35 40 45 Glu Ala Val Ala Lys Phe Leu Asp Ser Thr Gly Ser Arg Leu Asp 50 55 60 Tyr Arg Arg Tyr Ala Asp Thr Leu Phe Asp Ile Leu Val Ala Gly 65 70 75 Ser Met Leu Ala Pro Gly Gly Thr Arg Ile Asp Asp Gly Asp Lys 80 85 90 Thr Lys Met Thr Asn His Cys Val Phe Ser Ala Asn Glu Asp His 95 100 105 Glu Thr Ile Arg Asn Tyr Ala Gln Val Phe Asn Lys Leu Ile Arg 110 115 120 Arg Tyr Lys Tyr Leu Glu Lys Ala Phe Glu Asp Glu Met Lys Lys 125 130 135 Leu Leu Leu Phe Leu Lys Ala Phe Ser Glu Thr Glu Gln Thr Lys 140 145 150 Leu Ala Met Leu Ser Gly Ile Leu Leu Gly Asn Gly Thr Leu Pro 155 160 165 Ala Thr Ile Leu Thr Ser Leu Phe Thr Asp Ser Leu Val Lys Glu 170 175 180 Gly Ile Ala Ala Ser Phe Ala Val Lys Leu Phe Lys Ala Trp Met 185 190 195 Ala Glu Lys Asp Ala Asn Ser Val Thr Ser Ser Leu Arg Lys Ala 200 205 210 Asn Leu Asp Lys Arg Leu Leu Glu Leu Phe Pro Val Asn Arg Gln 215 220 225 Ser Val Asp His Phe Ala Lys Tyr Phe Thr Asp Ala Gly Leu Lys 230 235 240 Glu Leu Ser Asp Phe Leu Arg Val Gln Gln Ser Leu Gly Thr Arg 245 250 255 Lys Glu Leu Gln Lys Glu Leu Gln Glu Arg Leu Ser Gln Glu Cys 260 265 270 Pro Ile Lys Glu Val Val Leu Tyr Val Lys Glu Glu Met Lys Arg 275 280 285 Asn Asp Leu Pro Glu Thr Ala Val Ile Gly Leu Leu Trp Thr Cys 290 295 300 Ile Met Asn Ala Val Glu Trp Asn Lys Lys Glu Glu Leu Val Ala 305 310 315 Glu Gln Ala Leu Lys His Leu Lys Gln Tyr Ala Pro Leu Leu Ala 320 325 330 Val Phe Ser Ser Gln Gly Gln Ser Glu Leu Ile Leu Leu Gln Lys 335 340 345 Val Gln Glu Tyr Cys Tyr Asp Asn Ile His Phe Met Lys Ala Phe 350 355 360 Gln Lys Ile Val Val Leu Phe Tyr Lys Ala Asp Val Leu Ser Glu 365 370 375 Glu Ala Ile Leu Lys Trp Tyr Lys Glu Ala His Val Ala Lys
Gly 380 385 390 Lys Ser Val Phe Leu Asp Gln Met Lys Lys Phe Val Glu Trp Leu 395 400 405 Gln Asn Ala Glu Glu Glu Ser Glu Ser Glu Gly Glu Glu Asn 410 415
376 amino acids amino acid single linear BLADNOT03 1600518 31
Met Lys Asp Val Pro Gly Phe Leu Gln Gln Ser Gln Ser Ser Gly 5 10 15 Pro Gly Gln Pro Ala Val Trp His Arg Leu Glu Glu Leu Tyr Thr 20 25 30 Lys Lys Leu Trp His Gln Leu Thr Leu Gln Val Leu Asp Phe Val 35 40 45 Gln Asp Pro Cys Phe Ala Gln Gly Asp Gly Leu Ile Lys Leu Tyr 50 55 60 Glu Asn Phe Ile Ser Glu Phe Glu His Arg Val Asn Pro Leu Ser 65 70 75 Leu Val Glu Ile Ile Leu His Val Val Arg Gln Met Thr Asp Pro 80 85 90 Asn Val Ala Leu Thr Phe Leu Glu Lys Thr Arg Glu Lys Val Lys 95 100 105 Ser Ser Asp Glu Ala Val Ile Leu Cys Lys Thr Ala Ile Gly Ala 110 115 120 Leu Lys Leu Asn Ile Gly Asp Leu Gln Val Thr Lys Glu Thr Ile 125 130 135 Glu Asp Val Glu Glu Met Leu Asn Asn Leu Pro Gly Val Thr Ser 140 145 150 Val His Ser Arg Phe Tyr Asp Leu Ser Ser Lys Tyr Tyr Gln Thr 155 160 165 Ile Gly Asn His Ala Ser Tyr Tyr Lys Asp Ala Leu Arg Phe Leu 170 175 180 Gly Cys Val Asp Ile Lys Asp Leu Pro Val Ser Glu Gln Gln Glu 185 190 195 Arg Ala Phe Thr Leu Gly Leu Ala Gly Leu Leu Gly Glu Gly Val 200 205 210 Phe Asn Phe Gly Glu Leu Leu Met His Pro Val Leu Glu Ser Leu 215 220 225 Arg Asn Thr Asp Arg Gln Trp Leu Ile Asp Thr Leu Tyr Ala Phe 230 235 240 Asn Ser Gly Asn Val Glu Arg Phe Gln Thr Leu Lys Thr Ala Trp 245 250 255 Gly Gln Gln Pro Asp Leu Ala Ala Asn Glu Ala Gln Leu Leu Arg 260 265 270 Lys Ile Gln Leu Leu Cys Leu Met Glu Met Thr Phe Thr Arg Pro 275 280 285 Ala Asn His Arg Gln Leu Thr Phe Glu Glu Ile Ala Lys Ser Ala 290 295 300 Lys Ile Thr Val Asn Glu Val Glu Leu Leu Val Met Lys Ala Leu 305 310 315 Ser Val Gly Leu Val Lys Gly Ser Ile Asp Glu Val Asp Lys Arg 320 325 330 Val His Met Thr Trp Val Gln Pro Arg Val Leu Asp Leu Gln Gln 335 340 345 Ile Lys Gly Met Lys Asp Arg Leu Glu Phe Trp Cys Thr Asp Val 350 355 360 Lys Ser Met Glu Met Leu Val Glu His Gln Ala His Asp Ile Leu 365 370 375 Thr
237 amino acids amino acid
single linear BLADNOT03 1602473 32
Met Leu Gly Gly Ser Leu Gly Ser Arg Leu Leu Arg Gly Val Gly 5 10 15 Gly Ser His Gly Arg Phe Gly Ala Arg Gly Val Arg Glu Gly Gly 20 25 30 Ala Ala Met Ala Ala Gly Glu Ser Met Ala Gln Arg Met Val Trp 35 40 45 Val Asp Leu Glu Met Thr Gly Leu Asp Ile Glu Lys Asp Gln Ile 50 55 60 Ile Glu Met Ala Cys Leu Ile Thr Asp Ser Asp Leu Asn Ile Leu 65 70 75 Ala Glu Gly Pro Asn Leu Ile Ile Lys Gln Pro Asp Glu Leu Leu 80 85 90 Asp Ser Met Ser Asp Trp Cys Lys Glu His His Gly Lys Ser Gly 95 100 105 Leu Thr Lys Ala Val Lys Glu Ser Thr Ile Thr Leu Gln Gln Ala 110 115 120 Glu Tyr Glu Phe Leu Ser Phe Val Arg Gln Gln Thr Pro Pro Gly 125 130 135 Leu Cys Pro Leu Ala Gly Asn Ser Val His Glu Asp Lys Lys Phe 140 145 150 Leu Asp Lys Tyr Met Pro Gln Phe Met Lys His Leu His Tyr Arg 155 160 165 Ile Ile Asp Val Ser Thr Val Lys Glu Leu Cys Arg Arg Trp Tyr 170 175 180 Pro Glu Glu Tyr Glu Phe Ala Pro Lys Lys Ala Ala Ser His Arg 185 190 195 Ala Leu Asp Asp Ile Ser Glu Ser Ile Lys Glu Leu Gln Phe Tyr 200 205 210 Arg Asn Asn Ile Phe Lys Lys Lys Ile Asp Glu Lys Lys Arg Lys 215 220 225 Ile Ile Glu Asn Gly Glu Asn Glu Lys Thr Val Ser 230 235
152 amino acids amino acid single linear LUNGNOT15 1605720 33
Met Glu Ala Val Leu Asn Glu Leu Val Ser Val Glu Asp Leu Leu 5 10 15 Lys Phe Glu Lys Lys Phe Gln Ser Glu Lys Ala Ala Gly Ser Val 20 25 30 Ser Lys Ser Thr Gln Phe Glu Tyr Ala Trp Cys Leu Val Arg Ser 35 40 45 Lys Tyr Asn Asp Asp Ile Arg Lys Gly Ile Val Leu Leu Glu Glu 50 55 60 Leu Leu Pro Lys Gly Ser Lys Glu Glu Gln Arg Asp Tyr Val Phe 65 70 75 Tyr Leu Ala Val Gly Asn Tyr Arg Leu Lys Glu Tyr Glu Lys Ala 80 85 90 Leu Lys Tyr Val Arg Gly Leu Leu Gln Thr Glu Pro Gln Asn Asn 95 100 105 Gln Ala Lys Glu Leu Glu Arg Leu Ile Asp Lys Ala Met Lys Lys 110 115 120 Asp Gly Leu Val Gly Met Ala Ile Val Gly Gly Met Ala Leu Gly 125 130 135 Val Ala Gly Leu Ala Gly Leu Ile Gly Leu Ala Val Ser Lys Ser 140 145 150 Lys Phe
179 amino acids amino acid single linear COLNTUT06 1610501 34
Met Pro Ser Lys Ser Leu Val Met Glu Tyr Leu
Ala His Pro Ser 5 10 15 Thr Leu Gly Leu Ala Val Gly Val Ala Cys Gly Met Cys Leu Gly 20 25 30 Trp Ser Leu Arg Val Cys Phe Gly Met Leu Pro Lys Ser Lys Thr 35 40 45 Ser Lys Thr His Thr Asp Thr Glu Ser Glu Ala Ser Ile Leu Gly 50 55 60 Asp Ser Gly Glu Tyr Lys Met Ile Leu Val Val Arg Asn Asp Leu 65 70 75 Lys Met Gly Lys Gly Lys Val Ala Ala Gln Cys Ser His Ala Ala 80 85 90 Val Ser Ala Tyr Lys Gln Ile Gln Arg Arg Asn Pro Glu Met Leu 95 100 105 Lys Gln Trp Glu Tyr Cys Gly Gln Pro Lys Val Val Val Lys Ala 110 115 120 Pro Asp Glu Glu Thr Leu Ile Ala Leu Leu Ala His Ala Lys Met 125 130 135 Leu Gly Leu Thr Val Ser Leu Ile Gln Asp Ala Gly Arg Thr Gln 140 145 150 Ile Ala Pro Gly Ser Gln Thr Val Leu Gly Ile Gly Pro Gly Pro 155 160 165 Ala Asp Leu Ile Asp Lys Val Thr Gly His Leu Lys Leu Tyr 170 175
196 amino acids amino acid single linear BLADNOT06 1720770 35
Met Ser Glu Gly Asp Ser Val Gly Glu Ser Val His Gly Lys Pro 5 10 15 Ser Val Val Tyr Arg Phe Phe Thr Arg Leu Gly Gln Ile Tyr Gln 20 25 30 Ser Trp Leu Asp Lys Ser Thr Pro Tyr Thr Ala Val Arg Trp Val 35 40 45 Val Thr Leu Gly Leu Ser Phe Val Tyr Met Ile Arg Val Tyr Leu 50 55 60 Leu Gln Gly Trp Tyr Ile Val Thr Tyr Ala Leu Gly Ile Tyr His 65 70 75 Leu Asn Leu Phe Ile Ala Phe Leu Ser Pro Lys Val Asp Pro Ser 80 85 90 Leu Met Glu Asp Ser Asp Asp Gly Pro Ser Leu Pro Thr Lys Gln 95 100 105 Asn Glu Glu Phe Arg Pro Phe Ile Arg Arg Leu Pro Glu Phe Lys 110 115 120 Phe Trp His Ala Ala Thr Lys Gly Ile Leu Val Ala Met Val Cys 125 130 135 Thr Phe Phe Asp Ala Phe Asn Val Pro Val Phe Trp Pro Ile Leu 140 145 150 Val Met Tyr Phe Ile Met Leu Phe Cys Ile Thr Met Lys Arg Gln 155 160 165 Ile Lys His Met Ile Lys Tyr Arg Tyr Ile Pro Phe Thr His Gly 170 175 180 Lys Arg Arg Tyr Arg Gly Lys Glu Asp Ala Gly Lys Ala Phe Ala 185 190 195 Ser
612 amino acids amino acid single linear BRAINON01 1832295 36
Met Ala Ala Ala Gly Arg Leu Pro Ser Ser Trp Ala Leu Phe Ser 5 10 15 Pro Leu Leu Ala Gly Leu Ala Leu Leu Gly Val Gly Pro Val Pro 20 25 30 Ala Arg Ala Leu His Asn Val Thr Ala Glu Leu Phe Gly
Ala Glu 35 40 45 Ala Trp Gly Thr Leu Ala Ala Phe Gly Asp Leu Asn Ser Asp Lys 50 55 60 Gln Thr Asp Leu Phe Val Leu Arg Glu Arg Asn Asp Leu Ile Val 65 70 75 Phe Leu Ala Asp Gln Asn Ala Pro Tyr Phe Lys Pro Lys Val Lys 80 85 90 Val Ser Phe Lys Asn His Ser Ala Leu Ile Thr Ser Val Val Pro 95 100 105 Gly Asp Tyr Asp Gly Asp Ser Gln Met Asp Val Leu Leu Thr Tyr 110 115 120 Leu Pro Lys Asn Tyr Ala Lys Ser Glu Leu Gly Ala Val Ile Phe 125 130 135 Trp Gly Gln Asn Gln Thr Leu Asp Pro Asn Asn Met Thr Ile Leu 140 145 150 Asn Arg Thr Phe Gln Asp Glu Pro Leu Ile Met Asp Phe Asn Gly 155 160 165 Asp Leu Ile Pro Asp Ile Phe Gly Ile Thr Asn Glu Ser Asn Gln 170 175 180 Pro Gln Ile Leu Leu Gly Gly Asn Leu Ser Trp His Pro Ala Leu 185 190 195 Thr Thr Thr Ser Lys Met Arg Ile Pro His Ser His Ala Phe Ile 200 205 210 Asp Leu Thr Glu Asp Phe Thr Ala Asp Leu Phe Leu Thr Thr Leu 215 220 225 Asn Ala Thr Thr Ser Thr Phe Gln Phe Glu Ile Trp Glu Asn Leu 230 235 240 Asp Gly Asn Phe Ser Val Ser Thr Ile Leu Glu Lys Pro Gln Asn 245 250 255 Met Met Val Val Gly Gln Ser Ala Phe Ala Asp Phe Asp Gly Asp 260 265 270 Gly His Met Asp His Leu Leu Pro Gly Cys Glu Asp Lys Asn Cys 275 280 285 Gln Lys Ser Thr Ile Tyr Leu Val Arg Ser Gly Met Lys Gln Trp 290 295 300 Val Pro Val Leu Gln Asp Phe Ser Asn Lys Gly Thr Leu Trp Gly 305 310 315 Phe Val Pro Phe Val Asp Glu Gln Gln Pro Thr Glu Ile Pro Ile 320 325 330 Pro Ile Thr Leu His Ile Gly Asp Tyr Asn Met Asp Gly Tyr Pro 335 340 345 Asp Ala Leu Val Ile Leu Lys Asn Thr Ser Gly Ser Asn Gln Gln 350 355 360 Ala Phe Leu Leu Glu Asn Val Pro Cys Asn Asn Ala Ser Cys Glu 365 370 375 Glu Ala Arg Arg Met Phe Lys Val Tyr Trp Glu Leu Thr Asp Leu 380 385 390 Asn Gln Ile Lys Asp Ala Met Val Ala Thr Phe Phe Asp Ile Tyr 395 400 405 Glu Asp Gly Ile Leu Asp Ile Val Val Leu Ser Lys Gly Tyr Thr 410 415 420 Lys Asn Asp Phe Ala Ile His Thr Leu Lys Asn Asn Phe Glu Ala 425 430 435 Asp Ala Tyr Phe Val Lys Val Ile Val Leu Ser Gly Leu Cys Ser 440 445 450 Asn Asp Cys Pro Arg Lys Ile Thr Pro Phe Gly Val Asn Gln Pro 455 460 465 Gly
Pro Tyr Ile Met Tyr Thr Thr Leu Asp Ala Asn Gly Tyr Leu 470 475 480 Lys Asn Gly Ser Ala Gly Gln Leu Ser Gln Ser Ala His Leu Ala 485 490 495 Leu Gln Leu Pro Tyr Asn Val Leu Gly Leu Gly Arg Ser Ala Asn 500 505 510 Phe Leu Asp His Leu Tyr Val Gly Ile Pro Arg Pro Ser Gly Glu 515 520 525 Lys Ser Ile Arg Lys Gln Glu Trp Thr Ala Ile Ile Pro Asn Ser 530 535 540 Gln Leu Ile Val Ile Pro Tyr Pro His Asn Val Pro Arg Ser Trp 545 550 555 Ser Ala Lys Leu Tyr Leu Thr Pro Ser Asn Ile Val Leu Leu Thr 560 565 570 Ala Ile Ala Leu Ile Gly Val Cys Val Phe Ile Leu Ala Ile Ile 575 580 585 Gly Ile Leu His Trp Gln Glu Lys Lys Ala Asp Asp Arg Glu Lys 590 595 600 Arg Gln Glu Ala His Arg Phe His Phe Asp Ala Met 605 610
101 amino acids amino acid single linear CORPNOT02 1990522 37
Met Ala Ala Pro Leu Ser Val Glu Val Glu Phe Gly Gly Gly Ala 5 10 15 Glu Leu Leu Phe Asp Gly Ile Lys Lys His Arg Val Thr Leu Pro 20 25 30 Gly Gln Glu Glu Pro Trp Asp Ile Arg Asn Leu Leu Ile Trp Ile 35 40 45 Lys Lys Asn Leu Leu Lys Glu Arg Pro Glu Leu Phe Ile Gln Gly 50 55 60 Asp Ser Val Arg Pro Gly Ile Leu Val Leu Ile Asn Asp Ala Asp 65 70 75 Trp Glu Leu Leu Gly Glu Leu Asp Tyr Gln Leu Gln Asp Gln Asp 80 85 90 Ser Val Leu Phe Ile Ser Thr Leu His Gly Gly 95 100
132 amino acids amino acid single linear BRAITUT02 2098087 38
Met Ala Lys Asp Ile Leu Gly Glu Ala Gly Leu His Phe Asp Glu 5 10 15 Leu Asn Lys Leu Arg Val Leu Asp Pro Glu Val Thr Gln Gln Thr 20 25 30 Ile Glu Leu Lys Glu Glu Cys Lys Asp Phe Val Asp Lys Ile Gly 35 40 45 Gln Phe Gln Lys Ile Val Gly Gly Leu Ile Glu Leu Val Asp Gln 50 55 60 Leu Ala Lys Glu Ala Glu Asn Glu Lys Met Lys Ala Ile Gly Ala 65 70 75 Arg Asn Leu Leu Lys Ser Ile Ala Lys Gln Arg Glu Ala Gln Gln 80 85 90 Gln Gln Leu Gln Ala Leu Ile Ala Glu Lys Lys Met Gln Leu Glu 95 100 105 Arg Tyr Arg Val Glu Tyr Glu Ala Leu Cys Lys Val Glu Ala Glu 110 115 120 Gln Asn Glu Phe Ile Asp Gln Phe Ile Phe Gln Lys 125 130
188 amino acids amino acid single linear BRAITUT03 2112230 39
Met Ala Asn Ser Gly Cys Lys Asp Val Thr Gly Pro Asp Glu Glu 5 10
15 Ser Phe Leu Tyr Phe Ala Tyr Gly Ser Asn Leu Leu Thr Glu Arg 20 25 30 Ile His Leu Arg Asn Pro Ser Ala Ala Phe Phe Cys Val Ala Arg 35 40 45 Leu Gln Asp Phe Lys Leu Asp Phe Gly Asn Ser Gln Gly Lys Thr 50 55 60 Ser Gln Thr Trp His Gly Gly Ile Ala Thr Ile Phe Gln Ser Pro 65 70 75 Gly Asp Glu Val Trp Gly Val Val Trp Lys Met Asn Lys Ser Asn 80 85 90 Leu Asn Ser Leu Asp Glu Gln Glu Gly Val Lys Ser Gly Met Tyr 95 100 105 Val Val Ile Glu Val Lys Val Ala Thr Gln Glu Gly Lys Glu Ile 110 115 120 Thr Cys Arg Ser Tyr Leu Met Thr Asn Tyr Glu Ser Ala Pro Pro 125 130 135 Ser Pro Gln Tyr Lys Lys Ile Ile Cys Met Gly Ala Lys Glu Asn 140 145 150 Gly Leu Pro Leu Glu Tyr Gln Glu Lys Leu Lys Ala Ile Glu Pro 155 160 165 Asn Asp Tyr Thr Gly Lys Val Ser Glu Glu Ile Glu Asp Ile Ile 170 175 180 Lys Lys Gly Glu Thr Gln Thr Leu 185
86 amino acids amino acid single linear BRSTTUT02 2117050 40
Met Thr Asp Arg Tyr Thr Ile His Ser Gln Leu Glu His Leu Gln 5 10 15 Ser Lys Tyr Ile Gly Thr Gly His Ala Asp Thr Thr Lys Trp Glu 20 25 30 Trp Leu Val Asn Gln His Arg Asp Ser Tyr Cys Ser Tyr Met Gly 35 40 45 His Phe Asp Leu Leu Asn Tyr Phe Ala Ile Ala Glu Asn Glu Ser 50 55 60 Lys Ala Arg Val Arg Phe Asn Leu Met Glu Lys Met Leu Gln Pro 65 70 75 Cys Gly Pro Pro Ala Asp Lys Pro Glu Glu Asn 80 85
222 amino acids amino acid single linear SININOT01 2184712 41
Met Ser Gly Leu Gly Arg Leu Phe Gly Lys Gly Lys Lys Glu Lys 5 10 15 Gly Pro Thr Pro Glu Glu Ala Ile Gln Lys Leu Lys Glu Thr Glu 20 25 30 Lys Ile Leu Ile Lys Lys Gln Glu Phe Leu Glu Gln Lys Ile Gln 35 40 45 Gln Glu Leu Gln Thr Ala Lys Lys Tyr Gly Thr Lys Asn Lys Arg 50 55 60 Ala Ala Leu Gln Ala Leu Arg Arg Lys Lys Arg Phe Glu Gln Gln 65 70 75 Leu Ala Gln Thr Asp Gly Thr Leu Ser Thr Leu Glu Phe Gln Arg 80 85 90 Glu Ala Ile Glu Asn Ala Thr Thr Asn Ala Glu Val Leu Arg Thr 95 100 105 Met Glu Leu Ala Ala Gln Ser Met Lys Lys Ala Tyr Gln Asp Met 110 115 120 Asp Ile Asp Lys Val Asp Glu Leu Met Thr Asp Ile Thr Glu Gln 125 130 135 Gln Glu Val Ala Gln Gln Ile Ser Asp Ala Ile Ser Arg Pro Met 140 145
150 Gly Phe Gly Asp Asp Val Asp Glu Asp Glu Leu Leu Glu Glu Leu 155 160 165 Glu Glu Leu Glu Gln Glu Glu Leu Ala Gln Glu Leu Leu Asn Val 170 175 180 Gly Asp Lys Glu Glu Glu Pro Ser Val Lys Leu Pro Ser Val Pro 185 190 195 Ser Thr His Leu Pro Ala Gly Pro Ala Pro Lys Val Asp Glu Asp 200 205 210 Glu Glu Ala Leu Lys Gln Leu Ala Glu Trp Val Ser 215 220
300 amino acids amino acid single linear BRAINON01 2290475 42
Met Ser Gly Ser Asn Gly Ser Lys Glu Asn Ser His Asn Lys Ala 5 10 15 Arg Thr Ser Pro Tyr Pro Gly Ser Lys Val Glu Arg Ser Gln Val 20 25 30 Pro Asn Glu Lys Val Gly Trp Leu Val Glu Trp Gln Asp Tyr Lys 35 40 45 Pro Val Glu Tyr Thr Ala Val Ser Val Leu Ala Gly Pro Arg Trp 50 55 60 Ala Asp Pro Gln Ile Ser Glu Ser Asn Phe Ser Pro Lys Phe Asn 65 70 75 Glu Lys Asp Gly His Val Glu Arg Lys Ser Lys Asn Gly Leu Tyr 80 85 90 Glu Ile Glu Asn Gly Arg Pro Arg Asn Pro Ala Gly Arg Thr Gly 95 100 105 Leu Val Gly Arg Gly Leu Leu Gly Arg Trp Gly Pro Asn His Ala 110 115 120 Ala Asp Pro Ile Ile Thr Arg Trp Lys Arg Asp Ser Ser Gly Asn 125 130 135 Lys Ile Met His Pro Val Ser Gly Lys His Ile Leu Gln Phe Val 140 145 150 Ala Ile Lys Arg Lys Asp Cys Gly Glu Trp Ala Ile Pro Gly Gly 155 160 165 Met Val Asp Pro Gly Glu Lys Ile Ser Ala Thr Leu Lys Arg Glu 170 175 180 Phe Gly Glu Glu Ala Leu Asn Ser Leu Gln Lys Thr Ser Ala Glu 185 190 195 Lys Arg Glu Ile Glu Glu Lys Leu His Lys Leu Phe Ser Gln Asp 200 205 210 His Leu Val Ile Tyr Lys Gly Tyr Val Asp Asp Pro Arg Asn Thr 215 220 225 Asp Asn Ala Trp Met Glu Thr Glu Ala Val Asn Tyr His Asp Glu 230 235 240 Thr Gly Glu Ile Met Asp Asn Leu Met Leu Glu Ala Gly Asp Asp 245 250 255 Ala Gly Lys Val Lys Trp Val Asp Ile Asn Asp Lys Leu Lys Leu 260 265 270 Tyr Ala Ser His Ser Gln Phe Ile Lys Leu Val Ala Glu Lys Arg 275 280 285 Asp Ala His Trp Ser Glu Asp Ser Glu Ala Asp Cys His Ala Leu 290 295 300
112 amino acids amino acid single linear LUNGNOT20 2353452 43
Met Glu Ala Tyr Glu Gln Val Gln Lys Gly Pro Leu Lys Leu Lys 5 10 15 Gly Val Ala Glu Leu Gly Val Thr Lys Arg Lys Lys Lys Lys Lys 20 25
30 Asp Lys Asp Lys Ala Lys Leu Leu Glu Ala Met Gly Thr Ser Lys 35 40 45 Lys Asn Glu Glu Glu Lys Arg Arg Gly Leu Asp Lys Arg Thr Pro 50 55 60 Ala Gln Ala Ala Phe Glu Lys Met Gln Glu Lys Arg Gln Met Glu 65 70 75 Arg Ile Leu Lys Lys Ala Ser Lys Thr His Lys Gln Arg Val Glu 80 85 90 Asp Phe Asn Arg His Leu Asp Thr Leu Thr Glu His Tyr Asp Ile 95 100 105 Pro Lys Val Ser Trp Thr Lys 110
251 amino acids amino acid single linear THP1NOT03 2469611 44
Met Ser Asp Ile Gly Asp Trp Phe Arg Ser Ile Pro Ala Ile Thr 5 10 15 Arg Tyr Trp Phe Ala Ala Thr Val Ala Val Pro Leu Val Gly Lys 20 25 30 Leu Gly Leu Ile Ser Pro Ala Tyr Leu Phe Leu Trp Pro Glu Ala 35 40 45 Phe Leu Tyr Arg Phe Gln Ile Trp Arg Pro Ile Thr Ala Thr Phe 50 55 60 Tyr Phe Pro Val Gly Pro Gly Thr Gly Phe Leu Tyr Leu Val Asn 65 70 75 Leu Tyr Phe Leu Tyr Gln Tyr Ser Thr Arg Leu Glu Thr Gly Ala 80 85 90 Phe Asp Gly Arg Pro Ala Asp Tyr Leu Phe Met Leu Leu Phe Asn 95 100 105 Trp Ile Cys Ile Val Ile Thr Gly Leu Ala Met Asp Met Gln Leu 110 115 120 Leu Met Ile Pro Leu Ile Met Ser Val Leu Tyr Val Trp Ala Gln 125 130 135 Leu Asn Arg Asp Met Ile Val Ser Phe Trp Phe Gly Thr Arg Phe 140 145 150 Lys Ala Cys Tyr Leu Pro Trp Val Ile Leu Gly Phe Asn Tyr Ile 155 160 165 Ile Gly Gly Ser Val Ile Asn Glu Leu Ile Gly Asn Leu Val Gly 170 175 180 His Leu Tyr Phe Phe Leu Met Phe Arg Tyr Pro Met Asp Leu Gly 185 190 195 Gly Arg Asn Phe Leu Ser Thr Pro Gln Phe Leu Tyr Arg Trp Leu 200 205 210 Pro Ser Arg Arg Gly Gly Val Ser Gly Phe Gly Val Pro Pro Ala 215 220 225 Ser Met Arg Arg Ala Ala Asp Gln Asn Gly Gly Gly Gly Arg His 230 235 240 Asn Trp Gly Gln Gly Phe Arg Leu Gly Asp Gln 245 250
811 amino acids amino acid single linear LIVRTUT04 2515476 45
Met Pro Leu Ser Ser Pro Asn Ala Ala Ala Thr Ala Ser Asp Met 5 10 15 Asp Lys Asn Ser Gly Ser Asn Ser Ser Ser Ala Ser Ser Gly Ser 20 25 30 Ser Lys Gly Gln Gln Pro Pro Arg Ser Ala Ser Ala Gly Pro Ala 35 40 45 Gly Glu Ser Lys Pro Lys Ser Asp Gly Lys Asn Ser Ser Gly Ser 50 55 60 Lys Arg Tyr Asn Arg Lys Arg Glu Leu Ser Tyr Pro Lys Asn Glu 65
70 75 Ser Phe Asn Asn Gln Ser Arg Arg Ser Ser Ser Gln Lys Ser Lys 80 85 90 Thr Phe Asn Lys Met Pro Pro Gln Arg Gly Gly Gly Ser Ser Lys 95 100 105 Leu Phe Ser Ser Ser Phe Asn Gly Gly Arg Arg Asp Glu Val Ala 110 115 120 Glu Ala Gln Arg Ala Glu Phe Ser Pro Ala Gln Phe Ser Gly Pro 125 130 135 Lys Lys Ile Asn Leu Asn His Leu Leu Asn Phe Thr Phe Glu Pro 140 145 150 Arg Gly Gln Thr Gly His Phe Glu Gly Ser Gly His Gly Ser Trp 155 160 165 Gly Lys Arg Asn Lys Trp Gly His Lys Pro Phe Asn Lys Glu Leu 170 175 180 Phe Leu Gln Ala Asn Cys Gln Phe Val Val Ser Glu Asp Gln Asp 185 190 195 Tyr Thr Ala His Phe Ala Asp Pro Asp Thr Leu Val Asn Trp Asp 200 205 210 Phe Val Glu Gln Val Arg Ile Cys Ser His Glu Val Pro Ser Cys 215 220 225 Pro Ile Cys Leu Tyr Pro Pro Thr Ala Ala Lys Ile Thr Arg Cys 230 235 240 Gly His Ile Phe Cys Trp Ala Cys Ile Leu His Tyr Leu Ser Leu 245 250 255 Ser Glu Lys Thr Trp Ser Lys Cys Pro Ile Cys Tyr Ser Ser Val 260 265 270 His Lys Lys Asp Leu Lys Ser Val Val Ala Thr Glu Ser His Gln 275 280 285 Tyr Val Val Gly Asp Thr Ile Thr Met Gln Leu Met Lys Arg Glu 290 295 300 Lys Gly Val Leu Val Ala Leu Pro Lys Ser Lys Trp Met Asn Val 305 310 315 Asp His Pro Ile His Leu Gly Asp Glu Gln His Ser Gln Tyr Ser 320 325 330 Lys Leu Leu Leu Ala Ser Lys Glu Gln Val Leu His Arg Val Val 335 340 345 Leu Glu Glu Lys Val Ala Leu Glu Gln Gln Leu Ala Glu Glu Lys 350 355 360 His Thr Pro Glu Ser Cys Phe Ile Glu Ala Ala Ile Gln Glu Leu 365 370 375 Lys Thr Arg Glu Glu Ala Leu Ser Gly Leu Ala Gly Ser Arg Arg 380 385 390 Glu Val Thr Gly Val Val Ala Ala Leu Glu Gln Leu Val Leu Met 395 400 405 Ala Pro Leu Ala Lys Glu Ser Val Phe Gln Pro Arg Lys Gly Val 410 415 420 Leu Glu Tyr Leu Ser Ala Phe Asp Glu Glu Thr Thr Glu Val Cys 425 430 435 Ser Leu Asp Thr Pro Ser Arg Pro Leu Ala Leu Pro Leu Val Glu 440 445 450 Glu Glu Glu Ala Val Ser Glu Pro Glu Pro Glu Gly Leu Pro Glu 455 460 465 Ala Cys Asp Asp Leu Glu Leu Ala Asp Asp Asn Leu Lys Glu Gly 470 475 480 Thr Ile Cys Thr Glu Ser Ser Gln Gln Glu Pro Ile Thr Lys Ser 485 490 495 Gly Phe
Thr Arg Leu Ser Ser Ser Pro Cys Tyr Tyr Phe Tyr Gln 500 505 510 Ala Glu Asp Gly Gln His Met Phe Leu His Pro Val Asn Val Arg 515 520 525 Cys Leu Val Arg Glu Tyr Gly Ser Leu Glu Arg Ser Pro Glu Lys 530 535 540 Ile Ser Ala Thr Val Val Glu Ile Ala Gly Tyr Ser Met Ser Glu 545 550 555 Asp Val Arg Gln Arg His Arg Tyr Leu Ser His Leu Pro Leu Thr 560 565 570 Cys Glu Phe Ser Ile Cys Glu Leu Ala Leu Gln Pro Pro Val Val 575 580 585 Ser Lys Glu Thr Leu Glu Met Phe Ser Asp Asp Ile Glu Lys Arg 590 595 600 Lys Arg Gln Arg Gln Lys Lys Ala Arg Glu Glu Arg Arg Arg Glu 605 610 615 Arg Arg Ile Glu Ile Glu Glu Asn Lys Lys Gln Gly Lys Tyr Pro 620 625 630 Glu Val His Ile Pro Leu Glu Asn Leu Gln Gln Phe Pro Ala Phe 635 640 645 Asn Ser Tyr Thr Cys Ser Ser Asp Ser Ala Leu Gly Pro Thr Ser 650 655 660 Thr Glu Gly His Gly Ala Leu Ser Ile Ser Pro Leu Ser Arg Ser 665 670 675 Pro Gly Ser His Ala Asp Phe Leu Leu Thr Pro Leu Ser Pro Thr 680 685 690 Ala Ser Gln Gly Ser Pro Ser Phe Cys Val Gly Ser Leu Glu Glu 695 700 705 Asp Ser Pro Phe Pro Ser Phe Ala Gln Met Leu Arg Val Gly Lys 710 715 720 Ala Lys Ala Asp Val Trp Pro Lys Thr Ala Pro Lys Lys Asp Glu 725 730 735 Asn Ser Leu Val Pro Pro Ala Pro Val Asp Ser Asp Gly Glu Ser 740 745 750 Asp Asn Ser Asp Arg Val Pro Val Pro Ser Phe Gln Asn Ser Phe 755 760 765 Ser Gln Ala Ile Glu Ala Ala Phe Met Lys Leu Asp Thr Pro Ala 770 775 780 Thr Ser Asp Pro Leu Ser Glu Glu Lys Gly Gly Lys Lys Arg Lys 785 790 795 Lys Gln Lys Gln Lys Leu Leu Phe Ser Thr Ser Val Val His Thr 800 805 810 Lys
352 amino acids amino acid single linear THP1AZS08 2754573 46
Met His Val Val Ala Pro Ala Ser Leu Arg Leu Gly Thr Gly Thr 5 10 15 Asn Leu Pro Pro Ser Pro Thr Cys Leu Thr Lys Leu Ala Leu Pro 20 25 30 Pro Ala Ala Glu Pro Ser Leu Leu Ala Met Ser Gln Ser Arg His 35 40 45 Arg Ala Glu Ala Pro Pro Leu Glu Arg Glu Asp Ser Gly Thr Phe 50 55 60 Ser Leu Gly Lys Met Ile Thr Ala Lys Pro Gly Lys Thr Pro Ile 65 70 75 Gln Val Leu His Glu Tyr Gly Met Lys Thr Lys Asn Ile Pro Val 80 85 90 Tyr Glu Cys Glu Arg Ser Asp Val Gln Ile His
Val Pro Thr Phe 95 100 105 Thr Phe Arg Val Thr Val Gly Asp Ile Thr Cys Thr Gly Glu Gly 110 115 120 Thr Ser Lys Lys Leu Ala Lys His Arg Ala Ala Glu Ala Ala Ile 125 130 135 Asn Ile Leu Lys Ala Asn Ala Ser Ile Cys Phe Ala Val Pro Asp 140 145 150 Pro Leu Met Pro Asp Pro Ser Lys Gln Pro Lys Asn Gln Leu Asn 155 160 165 Pro Ile Gly Ser Leu Gln Glu Leu Ala Ile His His Gly Trp Arg 170 175 180 Leu Pro Glu Tyr Thr Leu Ser Gln Glu Gly Gly Pro Ala His Lys 185 190 195 Arg Glu Tyr Thr Thr Ile Cys Arg Leu Glu Ser Phe Met Glu Thr 200 205 210 Gly Lys Gly Ala Ser Lys Lys Gln Ala Lys Arg Asn Ala Ala Glu 215 220 225 Lys Phe Leu Ala Lys Phe Ser Asn Ile Ser Pro Glu Asn His Ile 230 235 240 Ser Leu Thr Asn Val Val Gly His Ser Leu Gly Cys Thr Trp His 245 250 255 Ser Leu Arg Asn Ser Pro Gly Glu Lys Ile Asn Leu Leu Lys Arg 260 265 270 Ser Leu Leu Ser Ile Pro Asn Thr Asp Tyr Ile Gln Leu Leu Ser 275 280 285 Glu Ile Ala Lys Glu Gln Gly Phe Asn Ile Thr Tyr Leu Asp Ile 290 295 300 Asp Glu Leu Ser Ala Asn Gly Gln Tyr Gln Cys Leu Ala Glu Leu 305 310 315 Ser Thr Ser Pro Ile Thr Val Cys His Gly Ser Gly Ile Ser Cys 320 325 330 Gly Asn Ala Gln Ser Asp Ala Ala His Asn Ala Leu Gln Tyr Leu 335 340 345 Lys Ile Ile Ala Glu Arg Lys 350
432 amino acids amino acid single linear TLYMNOT04 2926777 47
Met Ile Ser Ala Ala Gln Leu Leu Asp Glu Leu Met Gly Arg Asp 5 10 15 Arg Asn Leu Ala Pro Asp Glu Lys Arg Thr Asn Val Arg Trp Asp 20 25 30 His Glu Ser Val Cys Lys Tyr Tyr Leu Cys Gly Phe Cys Pro Ala 35 40 45 Glu Leu Phe Thr Asn Thr Arg Ser Asp Leu Gly Pro Cys Glu Lys 50 55 60 Ile His Asp Glu Asn Leu Arg Lys Gln Tyr Glu Lys Ser Ser Arg 65 70 75 Phe Met Lys Val Gly Tyr Glu Arg Asp Phe Leu Arg Tyr Leu Gln 80 85 90 Ser Leu Leu Ala Glu Val Glu Arg Arg Ile Arg Arg Gly His Ala 95 100 105 Arg Leu Ala Leu Ser Gln Asn Gln Gln Ser Ser Gly Ala Ala Gly 110 115 120 Pro Thr Gly Lys Asn Glu Glu Lys Ile Gln Val Leu Thr Asp Lys 125 130 135 Ile Asp Val Leu Leu Gln Gln Ile Glu Glu Leu Gly Ser Glu Gly 140 145 150 Lys Val Glu Glu Ala Gln Gly Met Met Lys Leu Val Glu
Gln Leu 155 160 165 Lys Glu Glu Arg Glu Leu Leu Arg Ser Thr Thr Ser Thr Ile Glu 170 175 180 Ser Phe Ala Ala Gln Glu Lys Gln Met Glu Val Cys Glu Val Cys 185 190 195 Gly Ala Phe Leu Ile Val Gly Asp Ala Gln Ser Arg Val Asp Asp 200 205 210 His Leu Met Gly Lys Gln His Met Gly Tyr Ala Lys Ile Lys Ala 215 220 225 Thr Val Glu Glu Leu Lys Glu Lys Leu Arg Lys Arg Thr Glu Glu 230 235 240 Pro Asp Arg Asp Glu Arg Leu Lys Lys Glu Lys Gln Glu Arg Glu 245 250 255 Glu Arg Glu Lys Glu Arg Glu Arg Glu Arg Glu Glu Arg Glu Arg 260 265 270 Lys Arg Arg Arg Glu Glu Glu Glu Arg Glu Lys Glu Arg Ala Arg 275 280 285 Asp Arg Glu Arg Arg Lys Arg Ser Arg Ser Arg Ser Arg His Ser 290 295 300 Ser Arg Thr Ser Asp Arg Arg Cys Ser Arg Ser Arg Asp His Lys 305 310 315 Arg Ser Arg Ser Arg Glu Arg Arg Arg Thr Arg Ser Arg Asp Arg 320 325 330 Arg Arg Ser Arg Ser His Asp Arg Ser Glu Arg Lys His Arg Ser 335 340 345 Arg Ser Arg Asp Arg Arg Arg Ser Lys Ser Arg Asp Arg Lys Ser 350 355 360 Tyr Lys His Arg Ser Lys Ser Arg Asp Arg Glu Gln Asp Arg Lys 365 370 375 Ser Lys Glu Lys Glu Lys Arg Gly Ser Asp Asp Lys Lys Ser Ser 380 385 390 Val Lys Ser Gly Ser Arg Glu Lys Gln Ser Glu Asp Thr Asn Thr 395 400 405 Glu Ser Lys Glu Ser Asp Thr Lys Asn Glu Val Asn Gly Thr Ser 410 415 420 Glu Asp Ile Lys Ser Glu Gly Asp Thr Gln Ser Asn 425 430
180 amino acids amino acid single linear TESTNOT07 3217567 48
Met Ala Ala Ala Glu Glu Glu Asp Gly Gly Pro Glu Gly Pro Asn 5 10 15 Arg Glu Arg Gly Gly Ala Gly Ala Thr Phe Glu Cys Asn Ile Cys 20 25 30 Leu Glu Thr Ala Arg Glu Ala Val Val Ser Val Cys Gly His Leu 35 40 45 Tyr Cys Trp Pro Cys Leu His Gln Trp Leu Glu Thr Arg Pro Glu 50 55 60 Arg Gln Glu Cys Pro Val Cys Lys Ala Gly Ile Ser Arg Glu Lys 65 70 75 Val Val Pro Leu Tyr Gly Arg Gly Ser Gln Lys Pro Gln Asp Pro 80 85 90 Arg Leu Lys Thr Pro Pro Arg Pro Gln Gly Gln Arg Pro Ala Pro 95 100 105 Glu Ser Arg Gly Gly Phe Gln Pro Phe Gly Asp Thr Gly Gly Phe 110 115 120 His Phe Ser Phe Gly Val Gly Ala Phe Pro Phe Gly Phe Phe Thr 125 130 135 Thr Val Phe Asn Ala His Glu Pro Phe
Arg Arg Gly Thr Gly Val 140 145 150Asp Leu Gly Gln Gly His Pro Ala Ser Ser Trp Gln Asp Ser Leu 155 160 165Phe Leu Phe Leu Ala Ile Phe Phe Phe Phe Trp Leu Leu Ser Ile 170 175 180137 amino acids amino acid single linear SPLNNOT10 3339274 49 Met SerSer Leu Ile Arg Arg Val Ile Ser Thr Ala Lys Ala Pro 5 10 15 Gly Ala IleGly Pro Tyr Ser Gln Ala Val Leu Val Asp Arg Thr 20 25 30 Ile Tyr Ile SerGly Gln Ile Gly Met Asp Pro Ser Ser Gly Gln 35 40 45 Leu Val Ser Gly GlyVal Ala Glu Glu Ala Lys Gln Ala Leu Lys 50 55 60 Asn Met Gly Glu Ile LeuLys Ala Ala Gly Cys Asp Phe Thr Asn 65 70 75 Val Val Lys Thr Thr Val LeuLeu Ala Asp Ile Asn Asp Phe Asn 80 85 90 Thr Val Asn Glu Ile Tyr Lys GlnTyr Phe Lys Ser Asn Phe Pro 95 100 105 Ala Arg Ala Ala Tyr Gln Val AlaAla Leu Pro Lys Gly Ser Arg 110 115 120 Ile Glu Ile Glu Ala Val Ala IleGln Gly Pro Leu Thr Thr Ala 125 130 135 Ser Leu 1600 base pairs nucleicacid single linear U937NOT01 133 50 CCCGGGGGCC CGGGCGGCAG GGCAAGCAGCGCGGCCTCGG CCTATGCGAC CGGTGGCGCC 60 GGCGCGGCTT CTGCCTGGAG AGGATTCAAGATGACCAACG AAGAACCTCT TCCCAAGAA 120 GTTCGATTGA GTGAAACAGA CTTCAAAGTTATGGCAAGAG ATGAGTTAAT TCTAAGATG 180 AAACAATATG AAGCATATGT ACAAGCTTTGGAGGGCAAGT ACACAGATCT TAACTCTAA 240 GATGTAACTG GCCTAAGAGA GTCTGAAGAAAAACTAAAGC AACAACAGCA GGAGTCTGC 300 CGCAGGGAAA ACATCCTTGT AATGCGACTAGCAACCAAGG AACAAGAGAT GCAAGAGTG 360 ACTACTCAAA TCCAGTACCT CAAGCAAGTCCAGCAGCCGA GCGTTGCCCA ACTGAGATC 420 ACAATGGTAG ACCCAGCGAT CAACTTGTTTTTCCTAAAAA TGAAAGGTGA ACTGGAACA 480 ACTAAAGACA AACTGGAACA AGCCCAAAATGAACTGAGTG CCTGGAAGTT TACGCCTGA 540 AGGTAAACAA ATCATACTCC CCAGTCAAGACTTCCCTGAC AGTCCCACTA CGAGAAAGC 600 GTGGTGGGAC AGCCAAGTAC TCGTTTCCACACCAAGACTC AGACTTTTTG AGCCAAAAA 660 AAGCCACATT CTTACACTGT CCAGCTTGTAATGGTTAATG TAAAACTTAC CAGATGAAC 720 TTGTGTTTCA GCTTTTTTCT TTTCCCCTTCCCCTTGCTTC AGAGGCCTGA TGGCGTCGG 780 CTATTCCGAA GAAGTGGCCA CCTCCGAAAAATTCCCCTTC TAGAACATGT AGACACTTG 840 GAAATGTTTC TGTTTGAAGA AAATAGAGGGAGAAACAGAA GTCTTAAGTC TGTGGCACA 900 TGTGTCTTCA GACAGTTTGA AGGAATGAAAACCTAGAGAT TTTAAATCAT 
GAATTGAAC 960 TGTAAAATTC CAGTAAAATG TAAAAACGGAATATGCATCG CTCTTAACCT TGAGCATA 1020 GACTTAGAGA CACTGTGTAT CAGTTTTGCCAATAAGACTG TGGACTTCAT GATTGTTG 1080 GAACTTCTGG GTCAAAACTC AAATGAGGTGAATTTTGCCT TTAAAGGGTT TATTTGCT 1140 GAACCAACTT TCAATAGTCA TGAGAGAATCAAATAATAGA TGTCCGTACA AGTAGCGC 1200 ATATTTAACC ATTTAGTTTG GGGCTCTATATTACTTGCTT GAGCCTTAAT CAATGTGG 1260 TTATTCAATG GTTTGTTCTT TGAATGGTTGCAAAAACTGT AGATAATCTT ACTGAGGA 1320 GTACAAACAT GAAGGTGTGG TATCAAACTTCAGGTTGAAA CTGTTTGAAG CATTATAA 1380 ATTCATTTCA CAACTAGATT GTATAAGGATATTAGCTGTG ATGAGACTCA CTGCATTA 1440 TTTTTTAGTG AATTTTATGA AATCCCCGTTCCATTCAACA GGCACATGTT TAAAAGAG 1500 TTGTCGTTGG TGTTAATGGG GGAATGTGTTCCTTCATTGT ATTTGGGCCT TTTGTATT 1560 ACTCTTGATA TTAAATTAAA TGTGCCTTGAAAAAAAAAAA 1600 1033 base pairs nucleic acid single linear U937NOT011762 51 GCCGCCGGGA GCCGGTGCGG CTGTGAGGGG CCGCGTCTCG CAGCAGCCGCCCGGACCGGC 60 ATGGTGTTGG GCGCCGGGCC CGCCTCGCCT GTCTCGGGGA GCCCAGGGTAAAGGCAGCA 120 TAATGCTAAC GCTAGCAAGT AAACTGAAGC GTGACGATGG TCTCAAAGGGTCCCGGACG 180 CAGCCACAGC GTCCGACTCG ACTCGGAGGG TTTCTGTGAG AGACAAATTGCTTGTTAAA 240 AGGTTGCAGA ACTTGAAGCT AATTTACCTT GTACATGTAA AGTGCATTTTCCTGATCCA 300 ACAAGCTTCA TTGTTTTCAG CTAACAGTAA CCCCAGATGA GGGTTACTACCAGGGTGGA 360 AATTTCAGTT TGAAACTGAA GTTCCCGATG CGTACAACAT GGTGCCTCCCAAAGTGAAA 420 GCCTGACCAA GATCTGGCAC CCCAACATCA CAGAGACAGG GGAAATATGTCTGAGTTTA 480 TGAGAGAACA TTCAATTGAT GGCACTGGCT GGGCTCCCAC AAGAACATTAAAGGATGTC 540 TTTGGGGATT AAACTCTTTG TTTACTGATC TTTTGAATTT TGATGATCCACTGAATATT 600 AAGCTGCAGA ACATCATTTG CGGGACAAGG AGGACTTCCG GAATAAAGTGGATGACTAC 660 TCAAACGTTA TGCCAGATGA TAAAAGGGGA CGATTGCAGG CCCATGGACTGTGTTACAG 720 TTGTCTCTAA CATGAAACAG CAAGAGGTAG CCCCCTCTCC CGTCCTCATGCTCCCTCTC 780 GTCCCCTGGA TTGCCCCAGT CCTGTGACCA TGTTGCCCTG AAGAAGACCATCTTCATGA 840 TGCTCATTGT AGATGGAGAA TTCAACATAA ATACAGCAAG AAAATGTGTTTGGGCTTCT 900 AAGAGTTGTC TGCTTACCTT AACATGTTTA CTTTTTTGAA CTTGTACTGTATAGGCTGT 960 GGTGAAATTC TTAAGAAGTT GTAATGAACT CAAAATTGAG GCCAGAGCTTGCTTTCCC 1020 TTCCCAAACA AAA 1033 1837 base pairs 
nucleic acid singlelinear U937NOT01 1847 52 TCGGAGAGGC ATCTGGGTTC GGACTGGGGC CGCCATGGGGAAAGTGAATG TGGCCAAGTT 60 GCGTTACATG AGCCGAGATG ACTTCAGGGT CTTGACCGCGGTTGAAATGG GCATGAAGA 120 CCATGAAATT GTTCCCGGCA GTTTGATTGC TTCTATAGCCAGCCTTAAAC ATGGTGGCT 180 TAATAAAGTT TTAAGAGAAT TAGTGAAACA TAAACTCATAGCTTGGGAGC GTACCAAAA 240 TGTCCAGGGC TATCGGTTGA CAAATGCAGG ATATGATTACCTAGCTTTGA AAACACTTT 300 TTCTAGGCAA GTAGTTGAGT CTGTTGGAAA CCAGATGGGTGTTGGCAAAG AATCAGATA 360 TTACATTGTT GCAAATGAAG AAGGACAACA ATTTGCATTAAAGCTTCACA GACTAGGAA 420 AACCTCGTTT CGAAATTTGA AAAACAAACG CGATTATCATAAACATAGGC ACAATGTGT 480 TTGGCTTTAT TTATCTCGTC TCTCTGCCAT GAAGGAATTTGCCTATATGA AGGCATTGT 540 TGAGAGGAAA TTTCCAGTTC CAAAGCCAAT TGATTACAATCGTCATGCAG TGGTCATGG 600 ACTCATAAAT GGTTATCCAC TATGTCAGAT ACACCATGTTGAAGATCCTG CATCAGTAT 660 TGATGAAGCT ATGGAACTAA TTGTCAAACT TGCAAATCATGGGCTGATTC ATGGAGATT 720 TAATGAATTT AATCTCATTT TGGATGAAAG TGACCATATCACCATGATTG ATTTTCCAC 780 GATGGTTTCA ACTTCTCATC CCAATGCTGA GTGGTATTTTGACAGAGATG TTAAATGCA 840 TAAAGATTTC TTTATGAAAC GTTTCAGCTA CGAAAGTGAGCTTTTTCCAA CTTTTAAGG 900 TATCAGGAGA GAAGACACTC TTGATGTGGA GGTTTCTGCCAGTGGCTACA CAAAGGAAA 960 GCAGGCAGAT GATGAACTGC TTCATCCATT AGGTCCAGATGATAAAAATA TTGAAACA 1020 AGAGGGATCT GAATTCTCAT TTTCAGATGG AGAAGTGGCAGAAAAAGCAG AGGTTTAC 1080 GTCAGAAAAT GAAAGTGAAC GGAACTGTCT AGAAGAATCAGAGGGCTGCT ATTGCAGA 1140 ATCTGGAGAC CCTGAACAAA TAAAGGAAGA CAGTTTATCAGAAGAGAGTG CTGATGCA 1200 GAGTTTTGAA ATGACTGAAT TCAATCAAGC TTTAGAAGAAATAAAAGGGC AGGTTGTT 1260 AAACAACTCT GTAACTGAAT TTTCTGAGGA GAAAAACAGAACTGAAAATT ACAACAGG 1320 AGATGGTCAG AGAGTTCAAG GAGGAGTCCC TGCTGGCTCTGACGAGTATG AAGATGAA 1380 CCCTCATCTA ATTGCCTTGT CGTCATTAAA TAGAGAATTCAGGCCTTTCA GAGATGAA 1440 AAATGTGGGA GCTATGAATC AGTATAGAAC AAGAACTCTGAGTATCACTT CTTCAGGC 1500 TGCTGTAAGC TGTTCAACAA TTCCTCCAGA ACTGGTGAAACAGAAGGTGA AACGTCAG 1560 GACAAAACAG CAAAAATCAG CTGTCAGACG TCGATTGCAGAAAGGAGAAG CAAATATA 1620 TACCAAGCAA CGTAGGGAAA ACATGCAAAA TATCAAATCAAGTTTGGAAG CAGCTAGC 1680 TTGGGGAGAA TAATATATTT AGGATCTTGG 
ATATGTTTAATATATTTTTT AAAGTTAC 1740 TAATTCCTTT TTGAGCCCTC ATTTGTCTTT TTTGAGCCAAGGCTATCATA TATTAATA 1800 TAAACCCTCC TTTCATCTAT AAAAAAAAAA AAAAAAA 18372031 base pairs nucleic acid single linear HMC1NOT01 9337 53 CGCGCCGGGACTCTGCCCAC TTCCACCAGA GACACATTGA GAAGGAGGAA ACTATGGCCT 60 CCAGGCTTCCGACGGCCTGG TCCTGTGAAC CAGTGACCTT TGAAGATGTA ACACTGGGT 120 TTACCCCGGAAGAGTGGGGA CTGCTGGACC TCAAACAGAA GTCCCTGTAC AGGGAAGTG 180 TGCTGGAGAACTACAGGAAC CTGGTCTCAG TGGAACATCA GCTTTCCAAA CCAGATGTG 240 TATCTCAGTTAGAGGAGGCA GAAGATTTCT GGCCAGTGGA GAGAGGAATT CCTCAAGAC 300 CCATTCCAGAGTATCCTGAG CTCCAGCTGG ACCCTAAATT GGATCCTCTT CCTGCTGAG 360 GTCCCCTAATGAACATTGAG GTTGTTGAGG TCCTCACACT GAACCAGGAG GTGGCTGGT 420 CCCGGAATGCCCAGATCCAG GCCCTATATG CTGAAGATGG AAGCCTGAGT GCAGATGCC 480 CCAGTGAGCAGGTCCAACAG CAGGGCAAGC ATCCAGGTGA CCCTGAGGCC GCGCGCCAG 540 GGTTCCGGCAGTTCCGTTAT AAGGACATGA CAGGTCCCCG GGAGGCCCTG GACCAGCTC 600 GAGAGCTGTGTCACCAGTGG CTACAGCCTA AGGCACGCTC CAAGGAGCAG ATCCTGGAG 660 TGCTGGTGCTGGAGCAGTTC CTAGGTGCAC TGCCTGTGAA GCTCCGGACA TGGGTGGAA 720 CGCAGCACCCAGAGAACTGC CAAGAGGTGG TGGCCCTGGT AGAGGGTGTG ACCTGGATG 780 CTGAGGAGGAAGTACTTCCT GGCAGGACAA CCTGCCGAGG GCACCACCTG CTGCCTCGA 840 GTCACTGCCCAGCAGGAGGA GAAGCAGGAG GATGCAGCCA TCTGCCCAGT GACAGTGCT 900 CCTGAGGAGCCAGTGACCTT CCAGGATGTG GCTGTGGACT TCAGCCGGGA GGAGTGGGG 960 CTGCTGGGCCCGACACAGAG GACCGAGTAC CGCGATGTGA TGCTGGAGAC CTTTGGGC 1020 CTGGTCTCTGTGGGGTGGGA GACTACACTG GAAAATAAAG AGTTAGCTCC AAATTCTG 1080 ATTCCTGAGGAAGAACCAGC CCCCAGCCTG AAAGTACAAG AATCCTCAAG GGATTGTG 1140 TTGTCCTCTACATTAGAAGA TACCTTGCAG GGTGGGGTCC AGGAAGTCCA AGACACAG 1200 TTGAAGCAGATGGAGTCTGC TCAGGAAAAA GACCTTCCTC AGAAGAAGCA CTTTGACA 1260 CGTGAGTCCCAGGCAAACAG TGGTGCTCTT GACACAAACC AAGTTTCGCT CCAGAAAA 1320 GACAACCCTGAGTCCCAGGC AAACAGTGGC GCTCTTGACA CAAACCAAGT TTTGCTCC 1380 AAAATTCCTCCTAGAAAACG ATTGCGCAAA CGTGACTCAC AAGTTAAAAG TATGAAAC 1440 AATTCACGTGTAAAAATTCA TCAGAAGAGC TGTGAAAGGC AAAAGGCCAA GGAAGGCA 1500 GGTTGTAGGAAAACCTTCAG TCGGAGTACT AAACAGATTA CGTTTATAAG AATTCACA 1560 GGGAGCCAAGTTTGCCGATG 
CAGTGAATGT GGTAAAATAT TCCGGAACCC AAGATACT 1620 TCTGTGCATAAGAAAATCCA TACCGGAGAG AGGCCCTATG TGTGTCAAGA CTGTGGGA 1680 GGATTTGTTCAGAGCTCTTC CCTCACACAG CATCAGAGAG TTCATTCTGG AGAGAGAC 1740 TTTGAATGTCAGGAGTGTGG GAGGACCTTC AATGATCGCT CAGCCATCTC CCAGCACC 1800 AGGACTCACACTGGCGCTAA GCCCTACAAG TGTCAGGACT GTGGAAAAGC CTTCCGCC 1860 AGCTCCCACCTCATCAGACA TCAGAGGACT CACACCGGGG AGCGCCCATA TGCATGCA 1920 AAATGTGGAAAGGCCTTCAC CCAGAGCTCA CACCTTATTG GGCACCAGAG AACCCACA 1980 AGGACAAAGCGAAAGAAGAA ACAGCCTACC TCATAGCTCT CAAGCCAGTT G 2031 1750 base pairsnucleic acid single linear HMC1NOT01 9476 54 GCCGCATGGT AACTCAGGCGCCGGGCGCAC TGTCCTAGCT GCTGGTTTTC CACGCTGGTT 60 TTAGCTCCCG GCGTCTGCAAAATGAAGATT GAGGAGGTGA AGAGCACTAC GAAGACGCA 120 CGCATCGCCT CCCACAGCCACGTGAAAGGG CTGGGGCTGG ACGAGAGCGG CTTGGCCAA 180 CAGGCGGCCT CAGGGCTTGTGGGCCAGGAG AACGCGCGAG AGGCATGTGG CGTCATAGT 240 GAATTAATCG AAAGCAAGAAAATGGCTGGA AGAGCTGTCT TGTTGGCAGG ACCTCCTGG 300 ACTGGCAAGA CAGCTCTGGCTCTGGCTATT GCTCAGGAGC TGGGTAGTAA GGTCCCCTT 360 TGCCCAATGG TGGGGAGTGAAGTTTACTCA ACTGAGATCA AGAAGACAGA GGTGCTGAT 420 GAGAACTTCC GCAGGGCCATTGGGCTGCGA ATAAAGGAGA CCAAGGAAGT TTATGAAGG 480 GAAGTCACAG AGCTAACTCCGTGTGAGACA GAGAATCCCA TGGGAGGATA TGGCAAAAC 540 ATTAGCCATG TGATCATAGGACTCAAAACA GCCAAAGGAA CCAAACAGTT GAAACTGGA 600 CCCAGCATTT TTGAAAGTTTGCAGAAAGAG CGAGTAGAAG CTGGAGATGT GATTTACAT 660 GAAGCCAACA GTGGGGCCGTGAAGAGGCAG GGCAGGTGTG ATACCTATGC CACAGAATT 720 GACCTTGAAG CTGAAGAGTATGTCCCCTTG CCAAAAGGGG ATGTGCACAA AAAGAAAGA 780 ATCATCCAAG ATGTGACCTTGCATGACTTG GATGTGGCTA ATGCGCGGCC CCAGGGGGG 840 CAAGATATCC TGTCCATGATGGGCCAGCTA ATGAAGCCAA AGAAGACAGA AATCACAGA 900 AAACTTCGAG GGGAGATTAATAAGGTGGTG AACAAGTACA TCGACCAGGG CATTGCTGA 960 CTGGTCCCGG GTGTGCTGTTTGTTGATGAG GTCCACATGC TGGACATTGA GTGCTTCA 1020 TACCTGCACC GCGCCCTGGAGTCTTCTATC GCTCCCATCG TCATCTTTGC ATCCAACC 1080 GGCAACTGTG TCATCAGAGGCACTGAGGAC ATCACATCCC CTCACGGCAT CCCTCTTG 1140 CTTCTGGACC GAGTGATGATAATCCGGACC ATGCTGTATA CTCCACAGGA AATGAAAC 1200 ATCATTAAAA TCCGTGCCCAGACGGAAGGA ATCAACATCA GTGAGGAGGC ACTGAACC 
1260 CTGGGGGAGA TTGGCACCAAGACCACACTG AGGTACTCAG TGCAGCTGCT GACCCCGG 1320 AACTTGCTTG CTAAAATCAACGGGAAGGAC AGCATTGAGA AAGAGCATGT CGAAGAGA 1380 AGTGAACTTT TCTATGATGCCAAGTCCTCC GCCAAAATCC TGGCTGACCA GCAGGATA 1440 TACATGAAGT GAGATGGCTGAGGTTTTCAG CAGTAAGAGA CTCCCCAGGT GTGCCTGG 1500 TGGGTCCAGC CTGTGGGCGCTTGCCCCTGG GCTTGGGGCT GCCGTCCCCA CTCAGGCG 1560 GTCTGCAGCG CTGTCAGTTCAGTGTGGAAA GCATTTCTTT TTAAGTTATC GTAACTGT 1620 CTGTGGTTGC TTTGAAAGAACCCTTCCTTA CCTGGTGTGT TTTCTATAAA TCTTCATA 1680 TTATTTTGAT TCTCTCTCTCTCTCTCTCTA AGTTTTTTAA AAATAAACTT TTCAGAAC 1740 AAAAAAAAAA 1750 1234 basepairs nucleic acid single linear THP1PLB01 10370 55 GGGCGGGGCCGGTCGGTGAG TCAGCGGCTC TCTGATCCAG CCCGGGAGAG GACCGAGCTG 60 GAGGAGCTGGGTGTGGGGTG CGTTGGGCTG GTGGGGAGGC CTAGTTTGGG TGCAAGTAG 120 TCTGATTGAGCTTGTGTTGT GCTGAAGGGA CAGCCCTGGG TCTAGGGGAG AGAGTCCCT 180 AGTGTGAGACCCGCCTTCCC CGGTCCCAGC CCCTCCCAGT TCCCCCAGGG ACGGCCACT 240 CCTGGTCCCCGACGCAACCA TGGCTGAAGA ACAACCGCAG GTCGAATTGT TCGTGAAGG 300 TGGCAGTGATGGGGCCAAGA TTGGGAACTG CCCATTCTCC CAGAGACTGT TCATGGTAC 360 GTGGCTCAAGGGAGTCACCT TCAATGTTAC CACCGTTGAC ACCAAAAGGC GGACCGAGA 420 AGTGCAGAAGCTGTGCCCAG GGGGGCAGCT CCCATTCCTG CTGTATGGCA CTGAAGTGC 480 CACAGACACCAACAAGATTG AGGAATTTCT GGAGGCAGTG CTGTGCCCTC CCAGGTACC 540 CAAGCTGGCAGCTCTGAACC CTGAGTCCAA CACAGCTGGG CTGGACATAT TTGCCAAAT 600 TTCTGCCTACATCAAGAATT CAAACCCAGC ACTCAATGAC AATCTGGAGA AGGGACTCC 660 GAAAGCCCTGAAGGTTTTAG ACAATTACTT AACATCCCCC CTCCCAGAAG AAGTGGATG 720 AACCAGTGCTGAAGATGAAG GTGTCTCTCA GAGGAAGTTT TTGGATGGCA ACGAGCTCA 780 CCTGGCTGACTGCAACCTGT TGCCAAAGTT ACACATAGTA CAGGTGGTGT GTAAGAAGT 840 CCGGGGATTCACCATCCCCG AGGCCTTCCG GGGAGTGCAT CGGTACTTGA GCAATGCCT 900 CGCCCGGGAAGAATTCGCTT CCACCTGTCC AGATGATGAG GAGATCGAGC TCGCCTATG 960 GCAAGTGGCAAAGGCCCTCA AATAAGCCCC TCCTGGGACT CCCTCAACCC CCTCCATT 1020 CTCCACAAAGGCCCTGGTGG TTTCCACATT GCTACCCAAT GGACACACTC CAAAATGG 1080 AGTGGGCAGGGAATCCTGGA GCACTTGTTC CGGGATGGTG TGGTGGAAGA GGGGATGA 1140 GAAAGAAATGGGGGGCCTGG GTCAGATTTT TATTGTGGGG TGGGATGAGT AGGACAAC 1200 
ATTTCAGTAATAAAATACAG AATAAAAAAA AAAA 1234 872 base pairs nucleic acid singlelinear THP1NOB01 30137 56 CACGCGTGCT GTCGGGGGAG GGATGCTGGG ACAGCTGCTCCCGCACACGG CTCGCGGTCT 60 CGGCGCCGCG GAGATGCCCG GCCAGGGTCC GGGGTCCGACTGGACGGAGC GTAGCTCTT 120 TGCAGAGCCG CCCGCTGTGG CCGGGACCGA GGGTGGCGGCGGCGGATCAG CTGGATACT 180 TTGTTACCAG AATTCCAAAG GTTCTGATAG AATCAAAGATGGATACAAAG TGAACTCAC 240 CATAGCTAAG CTGCAAGAGT TATGGAAAAC TCCCCAAAATCAAACAATCC ACCTCTCTA 300 ATCAATGATG GAGGCGTCCT TTTTCAAGCA TCCAGACCTCACCACAGGCC AGAAGCGTT 360 CCTGTGCAGC ATTGCTAAAA TCTATAATGC AAACTATCTGAAGATGTTAA TGAAGAGGC 420 GTACATGCAC GTACTTCAGC ACAGCTCACA AAAGCCAGGTGTCCTCACTC ATCACAGAA 480 CCGCCTTAGC TCCCGTTACT CACAGAAACA GCATTACCCTTGCACTACAT GGCGACATC 540 ACTGGAGAGA GAGGACTCGG GGTCTTCTGA TATCGCAGCTGCATCTGCAC CTGAAATGC 600 CATACAGCAT TCCCTTTGGC GGCCAGTGAG AAACAAAGAAGGGATAAAAA CTGGATATG 660 ATCTAAAACA AGATGTAAGT CACTGAAGAT TTTTAGAAGACCAAGGAAAC TGTTCATGC 720 AACAGTTTCT TCAGATGATT CTGAATCACA CATGAGTGGAGAAAAAAAGG GAAGAGGAT 780 TACTACATAA TTTTATGCAA TCCATGTCAA TTGAGGACAAGGGGGACATC TGATGTTNA 840 TTGACAGTCT TGTCTCGTGT ATTGAATTCG TG 872 691base pairs nucleic acid single linear SYNORAB01 77180 57 GGGAAAATGGCGCTGGCCAT GCTGGTCTTG GTGGTTTCGC CGTGGTCTGC GGCCCGGGGA 60 GTGCTTCGAAACTACTGGGA GCGACTGCTA CGGAAGCTTC CGCAGAGCCG GCCGGGCTT 120 CCCAGTCCTCCGTGGGGACC AGCATTAGCA GTACAGGGCC CAGCCATGTT TACAGAGCC 180 GCAAATGATACCAGTGGAAG TAAAGAGAAT TCCAGCCTTT TGGACAGTAT CTTTTGGAT 240 GCAGCTCCCAAAAATAGACG CACCATTGAA GTTAACCGGT GTAGGAGAAG AAATCCGCA 300 AAGCTTATTAAAGTTAAGAA CAACATAGAC GTTTGTCCTG AATGTGGTCA CCTGAAACA 360 AAACATGTCCTTTGTGCCTA CTGCTATGAA AAGGTGTGCA AGGAGACTGC AGAAATCAG 420 CGACAGATAGGGAAGCAAGA AGGGGGCCCT TTTAAGGCTC CCACCATAGA GACTGTGGT 480 CTGTACACAGGAGAGACACC GTCTGAACAA GATCAGGGCA AGAGGATCAT TGAACGAGA 540 AGAAAGCGACCATCCTGGTT CACCCAGAAT TGACACCAAA GATGTTAAAA GGATAACTT 600 ACAGTAAATCATTTCTCCTG AAATAGAGGA AGATTCTTTA CGTTGTTGTG CTTGTTTTT 660 AATCATCAGTATAGTTTAAC ACATTCTTTC T 691 1994 base pairs nucleic acid single linearPITUNOR01 98974 
58 CGGCTCGAGG CGTCTTGGCG GCAGTTGGTG GAACCGGAGCTTCGAGTCCG TCCCCGGTGC 60 TGCCTGCGCG TTCACCTGAG TCTCGCTGGA GCTCTTCTCGCCCGCCCACC TCATCTCAA 120 CCACTTTCCG CGGGGAGCGG CGCCAAGCTG GGCCTTCCTCGGATCAGGCG TCCCCTGAA 180 TCGGCACGCC CCTCTGCGTC CCCCTTCGGT CCCGCTAGGACCCCGTCCGG GCTGCCGTC 240 CCTCGTCGCT ATGGCGCCCA CCATCCAGAC CCAGGCCCAGCGGGAGGATG GCCACAGGC 300 CAATTCCCAC CGGACTCTGC CTGAGAGGTC TGGAGTGGTCTGCCGAGTCA AGTACTGCA 360 TAGCCTCCCT GATATCCCCT TCGACCCCAA GTTCATCACCTACCCCTTCG ACCAGAACA 420 GTTCGTCCAG TACAAAGCCA CTTCCTTGGA GAAACAGCACAAACATGACC TCCTGACTG 480 GCCAGACCTG GGGGTCACCA TCGATCTCAT CAATCCTGACACCTACCGCA TCGACCCCA 540 TGTTCTTCTA GATCCAGCTG ATGAGAAACT TTTGGAAGAGGAGATTCAGG CCCCCACCA 600 CTCCAAGAGA TCCCAGCAGC ATGCGAAGGT GGTGCCATGGATGCGAAAGA CAGAGTACA 660 CTCCACTGAG TTCAACCGTT ATGGCATCTC CAATGAGAAGCCTGAGGTCA AGATTGGGG 720 TTCTGTGAAG CAGCAGTTTA CCGAGGAAGA AATATACAAAGACAGGGATA GCCAGATCA 780 AGCCATTGAG AAGACTTTTG AGGATGCCCA GAAATCAATCTCACAGCATT ACAGCAAAC 840 CCGAGTCACA CCGGTGGAGG TCATGCCTGT CTTCCCAGACTTTAAGATGT GGATCAATC 900 ATGTGCTCAG GTGATCTTTG ACTCAGACCC AGCCCCCAAGGACACGAGTG GTGCAGCTG 960 GTTGGAGATG ATGTCTCAGG CCATGATTAG GGGCATGATGGATGAGGAAG GGAACCAG 1020 TGTGGCCTAT TTCCTGCCTG TAGAAGAGAC GTTGAAGAAACGAAAGCGGG ACCAGGAG 1080 GGAGATGGAC TATGCACCAG ATGATGTGTA TGACTACAAAATTGCTCGGG AGTACAAC 1140 GAACGTGAAG AACAAAGCTA GCAAGGGCTA TGAGGAAAACTACTTCTTCA TCTTCCGA 1200 GGGTGACGGG GTTTACTACA ATGAGTTGGA AACCAGGGTCCGCCTTAGTA AGCGCCGG 1260 CAAGGCTGGG GTTCAGTCAG GCACCAACGC CCTGCTTGTGGTCAAACATC GGGACATG 1320 TGAGAAGGAA CTGGAAGCTC AGGAGGCACG GAAGGCCCAGCTAGAAAACC ACGAACCG 1380 GGAGGAAGAG GAAGAGGAGA TGGAGACAGA AGAGAAAGAAGCTGGGGGCT CAGATGAG 1440 GCAGGAGAAG GGCAGCAGCA GTGAGAAGGA GGGCAGTGAAGATGAGCACT CGGGCAGC 1500 GAGTGAACGG GAGGAAGGTG ACAGGGACGA GGCCAGTGACAAGAGTGGCA GTGGTGAG 1560 CGAGAGCAGC GAGGATGAGG CCCGGGCTGC CCGTGACAAAGAGGAGATCT TTGGCAGT 1620 TGCTGATTCT GAGGACGATG CCGACTCTGA TGATGAGGACAGAGGACAGG CCCAAGGT 1680 CAGTGACAAT GATTCAGACA GCGGCAGCAA TGGGGGTGGCCAGCGGAGCC GGAGCCAC 1740 CCGCAGCGCC AGTCCCTTCC 
CCAGTGGCAG CGAGCACTCGGCCCAGGAGG ATGGCAGT 1800 AGCTGCAGCT TCTGATTCCA GTGAAGCTGA TAGTGACAGTGACTGAGTCC CAGGGCAT 1860 AGGGCTGGTT CAGACACCAT TATTGTGAGC AGCAAAGCACTTTTCTAGTG GTCTGTTT 1920 GAGCCTTTCA CTTGTTTGTT CCCCACCCCC AAACCTTTGCTGTTAATAAA GTCAACTT 1980 CTTTAAAAAA AAAA 1994 1594 base pairs nucleicacid single linear MUSCNOT01 118160 59 CCGCCCCCGC CGGGCGTGTG TGTCGTGTGTGTTTGGGGCC CGCGCGGGTT GCGCGCCCTC 60 CGCCTTCGCG CCTCCTGCCC CCGAGGCCCTACTGCTGCCC CTGTGCCCCT CGCCCCGCC 120 GGCGTCGCGG GCCAACATGG GCCAGGAAGAGGAGCTGCTG AGGATCGCCA AAAAGCTGG 180 GAAGATGGTG GCCAGGAAGA ACACGGAAGGGGCCCTGGAC CTTCTGAAGA AGCTGCACA 240 CTGCCAGATG TCCATCCAGC TACTACAGACAACCAGGATT GGAGTTGCTG TTAATGGGG 300 CCGCAAGCAC TGCTCAGACA AGGAGGTGGTGTCCTTGGCC AAAGTCCTTA TCAAAAACT 360 GAAGCGGCTG CTAGACTCCC CTGGACCCCCAAAAGGAGAA AAAGGAGAGG AAAGAGAAA 420 GGCAAAGAAG AAGGAAAAAG GGCTTGAGTGTTCAGACTGG AAGCCAGAAG CAGGCCTTT 480 TCCACCAAGG AAAAAACGAG AAGACCCCAAAACCAGGAGA GACTCTGTGG ACTCCAAGT 540 TTCTGCCTCC TCCTCTCCAA AAAGACCATCGGTGGAAAGA TCAAACAGCA GCAAATCAA 600 AGCGGAGAGC CCCAAAACAC CTAGCAGCCCCTTGACCCCC ACGTTTGCCT CTTCCATGT 660 TCTCCTGGCC CCCTGCTATC TCACAGGGGACTCTGTCCGG GACAAGTGTG TGGAGATGC 720 GTCAGCAGCC CTGAAGGCGG ACGATGATTACAAGGACTAT GGAGTCAACT GTGACAAGA 780 GGCATCAGAA ATCGAAGATC ATATCTACCAAGAGCTCAAG AGCACGGACA TGAAGTACC 840 GAACCGCGTG CGCAGCCGCA TAAGCAACCTCAAGGACCCC AGGAACCCCG GCCTGCGGC 900 GAACGTGCTC AGTGGGGCCA TCTCCGCAGGGCTTATAGCC AAGATGACGG CAGAGGAAA 960 GGCCAGTGAT GAACTGAGGG AGTTGAGGAATGCCATGACC CAGGAGGCCA TCCGTGAG 1020 CCAGATGGCC AAGACTGGCG GCACCACCACTGACCTCTTC CAGTGCAGCA AATGCAAG 1080 GAAGAACTGC ACCTATAACC AGGTGCAGACACGCAGTGCT GATGAGCCCA TGACTACC 1140 TGTCTTATGC AATGAATGTG GCAATCGCTGGAAGTTCTGC TGATGGAACA GCCAGCCA 1200 AACAAGGTGA GGAAGAAGAA AGAGGAAGCGCTGAATTATC TGAACTGGAG AAGCAATA 1260 AATTAAAGTG AAGGAAAATA CTGAACTCTGTCTGAGTGGG ATGGTATGAG TTAGAGGA 1320 AATTCTCTTG CAAATTAATA ATCGGTCATTAGAAACAATT GGTTAATGGG GGAGCCTA 1380 TGGAGAATGA TGCTGAGAAT TTGTATTGATGAACCTCTTT TAGAAACTGC AGAGGGCT 1440 GCACGGTGGC TTATGGCTGT 
AATCTGCAAACTCTGGGAGG CTGAGGTGGG AGAATCGC 1500 AACCCCAGAA GTTTGAGTCC AGCCCAGGCAACACAGCAAG ACCCCATCTC TATAAAAA 1560 AAAAATAAAG AAATTGTAGA CGCCTCGGGGACAT 1594 1460 base pairs nucleic acid single linear TLYMNOR01 140516 60GGCCGTCCGG CCTCCCTGAC ATGCAGATTT CCACCCAGAA GACAGAGAAG GAGCCAGTGG 60TCATGGAATG GGCTGGGGTC AAAGACTGGG TGCCTGGGAG CTGAGGTAGC CACCGTTTC 120GCCTGGCCAG CCCTCTGGAC CCCGAGGTTG GACCCTACTG TGACACACCT ACCATGCGG 180CACTCTTCAA CCTCCTCTGG CTTGCCCTGG CCTGCAGCCC TGTTCACACT ACCCTGTCA 240AGTCAGATGC CAAAAAAGCC GCCTCAAAGA CGCTGCTGGA GAAGAGTCAG TTTTCAGAT 300AGCCGGTGCA AGACCGGGGT TTGGTGGTGA CGGACCTCAA AGCTGAGAGT GTGGTTCTT 360AGCATCGCAG CTACTGCTCG GCAAAGGCCC GGGACAGACA CTTTGCTGGG GATGTACTG 420GCTATGTCAC TCCATGGAAC AGCCATGGCT ACGATGTCAC CAAGGTCTTT GGGAGCAAG 480TCACACAGAT CTCACCCGTC TGGCTGCAGC TGAAGAGACG TGGCCGTGAG ATGTTTGAG 540TCACGGGCCT CCACGACGTG GACCAAGGGT GGATGCGAGC TGTCAGGAAG CATGCCAAG 600GCCTGCACAT AGTGCCTCGG CTCCTGTTTG AGGACTGGAC TTACGATGAT TTCCGGAAC 660TCTTAGACAG TGAGGATGAG ATAGAGGAGC TGAGCAAGAC CGTGGTCCAG GTGGCAAAG 720ACCAGCATTT CGATGGCTTC GTGGTGGAGG TCTGGAACCA GCTGCTAAGC CAGAAGCGC 780TGGGCCTCAT CCACATGCTC ACCCACTTGG CCGAGGCTCT GCACCAGGCC CGGCTGCTG 840CCCTCCTGGT CATCCCGCCT GCCATCACCC CCGGGACCGA CCAGCTGGGC ATGTTCACG 900ACAAGGAGTT TGAGCAGCTG GCCCCCGTGC TGGATGGTTT CAGCCTCATG ACCTACGAC 960ACTCTACAGC GCATCAGCCT GGCCCTAATG CACCCCTGTC CTGGGTTCGA GCCTGCGT 1020AGGTCCTGGA CCCGAAGTCC AAGTGGCGAA GCAAAATCCT CCTGGGGCTC AACTTCTA 1080GTATGGACTA CGCGACCTCC AAGGATGCCC GTGAGCCTGT TGTCGGGGCC AGGTACAT 1140AGACACTGAA GGACCACAGG CCCCGGATGG TGTGGGACAG CCAGGCCTCA GAGCACTT 1200TCGAGTACAA GAAGAGCCGC AGTGGGAGGC ACGTCGTCTT CTACCCAACC CTGAAGTC 1260TGCAGGTGCG GCTGGAGCTG GCCCGGGAGC TGGGCGTTGG GGTCTCTATC TGGGAGCT 1320GCCAGGGCCT GGACTACTTC TACGACCTGC TCTAGGTGGG CATTGCGGCC TCCGCGGT 1380ACGTGTTCTT TTCTAAGCCA TGGAGTGAGT GAGCAGGTGT GAAATACAGG CCTCCACT 1440GTTTGCTGTG AAAAAAAAAA 1460 1594 base pairs nucleic acid single linearSPLNNOT02 207452 61 CGGCTCGAGC GGCTCGAGCG GCTCGAGGCG GAGAGCGCGGAGCCGGGCCG 
CACCCGCCGA 60 GCCGTGAAAA AAGTACATCT CCTGGAAGGG ATGCTTTTTAGCTGAGCTCT GGTGGATGA 120 AGGAGCTAGC CTTGAAAAAC TTGATACTGA TGGACATTGTGTGGGCCAGA GGCAGGGAT 180 GTTGGCTATG ACCCCAAACC AGATGGCAGG AATAACACCAAGTTCCAGGT GGCAGTGGC 240 GGGTCTGTGT CTGGACTTGT TACTCGGGCG CTGATCAGTCCCTTCGACGT CATCAAGAT 300 CGTTTCCAGC TTCAGCATGA GCGCCTGTCT CGCAGTGACCCCAGCGCAAA GTACCATGG 360 ATCCTCCAGG CCTCTAGGCA GATTCTGCAG GAGGAGGGTCCGACAGCTTT CTGGAAAGG 420 CACGTCCCAG CTCAGATTCT CTCCATAGGC TATGGAGCTGTCCAATTCTT GTCATTTGA 480 ATGCTGACGG AGCTGGTCCA CAGAGGCAGC GTGTACGACGCCCGGGAATT CTCAGTGCA 540 TTTGTATGTG GTGGCCTGGC TGCCTGTATG GCCACCCTCACTGTGCACCC CGTGGATGT 600 CTGCGCACCC GCTTTGCAGC TCAGGGTGAG CCCAAGGTCTATAATACGCT GCGCCACGC 660 GTGGGGACCA TGTATAGGAG CGAAGGCCCC CAGGTTTTCTACAAAGGCTT GGCTCCCAC 720 TTGATCGCCA TCTTCCCCTA CGCCGGGCTG CAGTTCTCTTGCTACAGCTC CTTGAAGCA 780 CTGTACAAGT GGGCCATACC AGCCGAAGGA AAGAAAAATGAGAACCTCCA AAACCTGCT 840 TGTGGCAGTG GAGCTGGTGT CATCAGCAAG ACCCTGACATATCCGCTGGA CCTCTTCAA 900 AAGCGGCTAC AGGTTGGAGG GTTTGAGCAT GCCAGAGCTGCCTTTGGCCA GGTACGGAG 960 TACAAGGGCC TCATGGACTG TGCCAAGCAG GTGCTGCAAAAGGAAGGCGC CCTGGGCT 1020 TTCAAGGGCC TGTCCCCCAG CTTGCTGAAG GCTGCCCTCTCCACAGGCTT CATGTTCT 1080 TCGTATGAAT TCTTCTGTAA TGTCTTCCAC TGCATGAACAGGACAGCCAG CCAGCGCT 1140 GCGCAGGAAG GACCCCAGGT CTTCCCTGGA GGCAGCCTCCTGAAGGAAGG AAGATTCA 1200 CTCCACTGAG AGGTGCCGTC TGGCCCTTCC CTGCAGGCCAGCTGCCCCAA GCGGGGTA 1260 AGCCTTGAAC CCACCCAGCT GGGACACCAC CAGAAGGTCCAGGGCTCTCC CCATGAGA 1320 ATCAGAGGGA TGCAGGACGT GGTCTATGGT GAGCCAACGACACAGTGAGA AGGAGCAG 1380 AGTTGCTGTT TCTCCTCTGA CCAGCCCACA CTGCAAAGGAAACAGACGCC ATCCTACA 1440 TATCAGCCCT GCCTGCCAGG AGAACAGAGC ACACTCCTGGTCTGGATGGG GCTGCTGC 1500 GAGTGCAGAG GGCTGCGGTA GGCCCTTTGC AGGAGTCAGGTCCCTACACT TGGCCTGT 1560 GTGCAACCTA TTTAATAGAC GATTAAAGCC TAGA 1594 1249base pairs nucleic acid single linear SPLNNOT02 208836 62 TCTGCTAGGGCACAAGAGAG ACGGGCGCTC GCGCTCTCGC AGTCCTCTTC CGTCAGTGTC 60 TTTTGCTTCGACTCCCGGCG GAGCGCGCAA CGTGGAGTGA CGTGCAGGGG CCAAGTGCA 120 CCCAGGCAGCCACGGCTGTT TCGGAGCTCA GGACTCTAAA 
ATGGCAGAGC AGCTTTCTC 180 AGGAAAGGCGGTGGATCAGG TGTGCACCTT CCTTTTCAAA AAGCCTGGGC GGAAAGGGG 240 TGCTGGACGCAGAAAGCGCC CGGCCTGCGA CCCAGAGCCC GGAGAAAGCG GCAGCAGTA 300 CGACGAAGGCTGCACTGTGG TTCGACCGGA AAAGAAGCGG GTGACCCACA ATCCAATGA 360 GCAGAAGACCCGTGACAGTG GTAAACAGAA GGCGGCTTAC GGCGACTTGA GCAGCGAAG 420 GGAAGAGGAAAATGAGCCCG AGAGTCTCGG CGTGGTTTAT AAATCCACCC GTTCGGCGA 480 ACCCGTGGGACCAGAGGATA TGGGAGCGAC AGCCGTCTAT GAGCTGGACA CAGAGAAAG 540 GCGCGATGCACAAGCCATCT TTGAGCGCAG CCAGAAGATC CAGGAGGAGC TGAGGGGCA 600 GGAGGATGACAAGATCTATC GGGGAATCAA CAATTATCAG AAATACATGA AGCCCAAGG 660 TACGTCTATGGGCAATGCCT CTTCCGGGAT GGTGAGGAAG GGCCCCATCC GAGCGCCCG 720 GCATCTACGTGCCACCGTGC GCTGGGATTA CCAGCCCGAC ATCTGTAAGG ACTACAAAG 780 GACTGGCTTCTGCGGCTTCG GAGACAGCTG CAAATTCCTC CATGACCGTT CAGATTACA 840 GCATGGGTGGCAGATCGAAC GTGAGCTTGA TGAGGGTCGC TATGGTGTCT ATGAGGATG 900 AAACTATGAAGTGGGAAGCG ATGATGAGGA AATACCATTC AAGTGTTTCA TCTGTCGCC 960 GAGCTTCCAAAACCCAGTTG TCACCAAGTG CAGGCATTAT TTCTGCGAGA GCTGTGCA 1020 GCAGCATTTCCGCACCACCC CGCGCTGCTA TGTCTGTGAC CAGCAGACCA ATGGCGTC 1080 CAATCCAGCGAAAGAATTGA TTGCTAAACT AGAGAAGCAT CGAGCTACAG GAGAGGGT 1140 TGCTTCCGACTTGCCAGAAG ACCCCGATGA GGATGCAATT CCCATTACTT AGGTTTCC 1200 TAATTCTTAAATTTAAAAAA TAAACGTTTT GTTCTTTTGG AAAAAAAAA 1249 1309 base pairs nucleicacid single linear MMLR3DT01 569710 63 CAGGCCCGCA GACCGGAAGC AGCCCGCGCCGGGGGCTTCT GGGAAAAGGC TTGTGAACGG 60 CGTTTCTGCG TCTGCCGTGG ACAGCGAANTGCTTGCGGTT CCTGAGCCGG AGGAGTGCC 120 GTGAAGAAAA CGGGGTATTG CCCTGAGGCTTATATTCTGC CTCAGTTGTC TTTTCTTGA 180 ATATTATAAA TCAGAATGTC TGCACAGTCAGTGGAAGAAG ATTCAATACT TATCATCCC 240 ACTCCAGATG AAGAGGAAAA AATTCTGAGAGTGAAGTTGG AGGAGGATCC TGATGGCGA 300 GAGGGATCAA GTATCCCCTG GAACCATCTCCCAGACCCAG AGATTTTCCG ACAGCGATT 360 AGGCAGTTTG GATACCAGGA TTCACCTGGGCCCCGTGAGG CTGTGAGCCA GCTCCGAGA 420 CTTTGCCGTC TGTGGCTCAG GCCAGAGACGCACACAAAAG AACAAATCTT GGAGCTGGT 480 GTGCTGGAGC AGTTTGTTGC CATCCTACCCAAAGAGCTAC AGACTTGGGT TCGAGATCA 540 CATCCAGAGA ATGGAGAGGA GGCAGTGACAGTGCTGGAGG ATTTGGAGAG TGAACTTGA 600 GACCCTGGAC 
AACCGGTTTC TCTCCGTCGACGAAAACGGG AAGTACTAGT AGAAGACAT 660 GTATCTCAAG AAGAAGCTCA GGGATTACCAAGTTCTGAGC TTGATGCTGT GGAGAACCA 720 CTCAAGTGGG CATCCTGGGA GCTCCATTCCCTAAGGCACT GTGATGATGA TGGTAGGAC 780 GAAAATGGAG CACTAGCTCC AAAGCAGGAGCTTCCTTCAG CATTAGAATC CCATGAAGT 840 CCTGGCACTC TCAGTATGGG TGTTCCTCAAATTTTTAAAT ATGGAGAAAC CTGTTTCCC 900 AAGGGCAGGT TTGAAAGAAA GAGAAATCCCTCTCGAAAGA AACAACATAT ATGTGATGA 960 TGTGGAAAAC ACTTCAGTCA GGGCTCAGCCCTTATTCTTC ATCAAAGAAT TCACAGTG 1020 GAGAAACCTT ATGGATGTGT TGAGTGTGGGAAAGCATTCA GCCGAAGTTC CATTCTTG 1080 CAACACCAGA GAGTCCACAC TGGAGAAAAACCTTACAAAT GTCTTGAATG TGGGAAAG 1140 TTTAGCCAGA ATTCGGGGCT TATTAATCATCAGAGAATCC ATACTGGGGA GAAACCTT 1200 GAATGCGTTC AGTGTGGGAA ATCGTATAGTCAAAGCTCAA ATCTTTTTAG ACATCAGA 1260 AGACACAATG CAGAAAAACT TCTGAATGTTGTGAAAGTTT AAGAAATTG 1309 76 base pairs nucleic acid single linearBRSTTUT01 606742 64 TTTTTTTTTT TTTTTTGTTG GAAAGGAGAT GTTTATTTTCTTCTTCCCAT GCTATGGAAG 60 GACATTGTAT TCCGCA 76 1327 base pairs nucleicacid single linear COLNNOT01 611135 65 CCTACCCTCT TCTGTTGCTT TCTCCCTGTGGCTCGCGCCG TCCCCCGCCG CCCGTCGACC 60 CCGCTTCCAT GTCCCTGGCG GACACAGCTCCCAGGAACCT CCACGCCCAT GGCCACTAG 120 CAGAGGGAAT CCTCTATCAC CTCCTGCTGTTCCACCTCGA GCTGCGACGC AGACGACGA 180 GGCGTGCGCG GCACCTGCGA AGATGCTTCCCTGTGCAAGA GGTTTGCAGT AAGCATTGG 240 TACTGGCATG ACCCTTACAT ACAGCACTTTGTGAGACTGT CTAAAGAGAG GAAAGCCCC 300 GAAATCAACA GAGGATATTT TGCTCGAGTCCATGGTGTCA GTCAGCTTAT AAAGGCATT 360 CTACGGAAGA CAGAATGTCA TTGTCAAATTGTCAACCTTG GGGCAGGCAT GGATACCAC 420 TTCTGGAGAT TAAAGGATGA AGATCTTCTCCCAAGTAAAT ATTTTGAGGT TGACTTTCC 480 ATGATTGTCA CGAGAAAGCT GCACAGTATCAAATGCAAGC CTCCCCTATC CAGCCCCAT 540 CTAGAACTGC ATTCAGAGGA CACACTTCAGATGGATGGAC ACATACTGGA TTCAAAGAG 600 TATGCCGTTA TTGGAGCAGA TCTCCGAGACCTGTCTGAAC TGGAAGAGAA GCTAAAGAA 660 TGTAACATGA ATACACAATT GCCAACACTCCTGATAGCTG AATGTGTGCT GGTTTACAT 720 ACTCCAGAGC AGTCCGCAAA CCTCCTGAAGTGGGCAGCCA ACAGTTTTGA GAGAGCCAT 780 TTCATAAACT ACGAACAGGT GAACATGGGTGATCGGTTTG GGCAGATCAT GATTGAAAA 840 CTGCGGAGAC GCCAGTGTGA 
CCTGGCGGGAGTGGAGACCT GCAAGTCATT AGAGTCACA 900 AAAGAACGGC TCCTGTCGAA TGGGTGGGAAACAGCATCGG CCGTCGACAT GATGGAGTT 960 TACAACAGGT TACCTCGAGC TGAAGTGAGCAGGATAGAAT CACTTGAATT CCTGGATG 1020 ATGGAGCTGC TGGAGCAGCT CATGCGGCATTACTGCCTTT GCTGGGCAAC CAAAGGAG 1080 AATGAGCTTG GGCTGAAGGA GATAACTTATTAATCTGTCG AAGGCTTATG CCGAGCCA 1140 AGCCGAAGCC ACTTGCCCTC CTGGAGGAGACCTGCAAGCT CCCTGAGCGG TGGGCGGG 1200 TCGTCCGCAG GTCTCATCCC ACACTCTTGAGAAGCCTTGG TCACTACAGT GGTCGCAC 1260 GTTCCTCTTC CTGTTCCTGT TGACATGTCGTTGTTTAAAT AAATCTCACT TGCCACCA 1320 AAAAAAA 1327 1892 base pairs nucleicacid single linear BRSTNOT03 641127 66 AGGCGTGGGG TTCCACGTGG GAGGTGAGACTGGCCGGGCT TCCCCAGCCG CTGGCAGAGG 60 CGTTCAGGGG TCAGGAAATG GGTGTGCCGCGGGAGACTTG AATGAAAATT CTGTGGAGA 120 TCTTCAACAG AAAACACTTC AGGATCTGTTACATGAGCTT TCCTCCTGGC TAGTTTTGG 180 AGGCATGGCC AGTACAATTA CTGGAAGTCAGGATTGTATT GTGAATCATC GAGGGGAAG 240 GGATGGGGAG CCTGAACTAG ATATTTCCCCTTGTCAACAG TGGGGAGAAG CATCTTCTC 300 TATTTCCAGA AACAGGGACA GTGTGATGACTCTTCAAAGT GGTTGTTTCG AAAACATTG 360 AAGTGAAACA TATTTGCCTT TGAAAGTCTCAAGCCAAATA GACACACAAG ACTCTTCAG 420 GAAGTTCTGT AAGAATGAGC CTCAGGATCATCAGGAAAGC AGACGTCTCT TTGTAATGG 480 AGAAAGCACT GAGAGAAAAG TGATAAAGGGGGAAAGTTGT TCAGAGAACC TTCAAGTTA 540 ACTGGTGTCT GATGGACAAG AACTGGCCTCGCCATTGTTA AATGGTGAGG CAACTTGCC 600 GAATGGCCAG TTAAAAGAAT CTTTGGATCCCATTGACTGT AACTGCAAAG ACATTCATG 660 ATGGAAATCA CAGGTGGTCA GTTGTAGTCAGCAGAGAGGT CATACAGAGG AGAAACCCT 720 TGACCATAAT AACTGTGGGA AAATACTTAACACCAGCCCA GATGGTCATC CATATGAGA 780 AATCCACACT GCAGAGAAAC AATACGAAGGTAGTCAGTGT GGTAAGAACT TCAGTCAAA 840 CTCAGAGCTA CTACTTCATC AGAGAGACCACACAGAAGAA AAACCCTACA AATGTGAGC 900 ATGTGGGAAG GGCTTCACAA GGAGCTCGAGTCTGCTTATC CATCAGGCAG TCCACACAG 960 TGAGAAGCCT TATAAGTGTG ACAAGTGTGGGAAGGGCTTC ACCAGGAGCT CAAGTCTG 1020 CATCCATCAT GCCGTCCATA CAGGCGAAAAACCTTATAAA TGTGACAAGT GTGGGAAG 1080 CTTTAGTCAG AGCTCCAAAC TGCACATCCACCAGCGAGTC CACACTGGAG AGAAGCCC 1140 TGAGTGTGAG GAGTGTGGTA TGAGCTTCAGTCAGCGCTCA AACCTGCACA TCCACCAG 1200 AGTACACACA GGAGAGAGGC CCTACAAGTGTGGTGAGTGT 
GGGAAGGGCT TCAGTCAG 1260 CTCGAACCTT CACATTCACC GGTGCATCCACACAGGAGAG AAGCCTTACC AATGCTAT 1320 GTGTGGGAAG GGTTTCAGCC AGAGCTCGGATCTTCGCATC CATCTCAGAG TCCACACT 1380 AGAGAAGCCC TATCACTGTG GCAAGTGTGGGAAGGGATTT AGCCAGAGTT CCAAACTC 1440 CATCCACCAG AGAGTACATA CTGGAGAGAAGCCCTATGAG TGCAGCAAGT GTGGGAAG 1500 CTTCAGCCAG AGCTCCAACC TTCACATCCACCAGCGGGTT CACAAGAGAG ATCCTCGA 1560 CCATCCAGGT CTTCACAGCG CTCATACTGTAAACACTGTT AAATATTTAG TATCACTC 1620 ACTTTATATT CTACAAAGGA GAGAGATGTAAGGGTTATTT AGATATGTTC CCTCACTG 1680 AAATCACTCA TTCAACATAT TTAAGTATCAAGCACTTTGT TATGCTGTAC AATGAATG 1740 TTGCTCTTGT TTCTCAGATG GGTAGAGTAAAAGTGTCTGT ACTTTACCGT TCAACTAC 1800 GTTCTACCCA GCATTTTAAC GGCAAGAACTTTATATTTAT TCTCATAGCA GGGCATGT 1860 CCCTTTGATC ACAGGCTCTG AGAATGCTTT AT1892 843 base pairs nucleic acid single linear LUNGTUT02 691768 67GGAAGTGTCT TCAGGGAGAG GAAGCCGGCG GCCTCACTGC TATGGGCCGC AACAAGAAGA 60AGAAGCGAGA TGGTGACGAC CGGCGGCCGA GGCTCGTTCT TAGCTTCGAC GAGGAGAAG 120GGCGGGAGTA CCTGACAGGC TTCCACAAGC GGAAGGTCGA GCGAAAGAAG GCAGCCATT 180AGGAGATTAA GCAGCGGCTG AAAGAGGAGC AGAGGAAGCT TCGGGAGGAG CGCCACCAG 240AATACTTGAA GATGCTGGCA GAGAGAGAAG AGGCTCTGGA GGAGGCAGAT GAGCTGGAC 300GGTTGGTGAC AGCAAAGACG GAGTCGGTGC AGTATGACCA CCCCAACCAC ACAGTCACC 360TGACCACCAT CAGTGACCTG GACCTCTCGG GGGCCCGGCT GCTCGGGCTG ACCCCACCT 420AGGGAGGGGC TGGAGACAGG TCTGAGGAGG AGGCGTCATC CACGGAGAAA CCAACCAAA 480CCTTGCCCAG GAAGTCCAGA GACCCCCTGC TCTCTCAGCG GATCTCCTCC CTCACAGCA 540CACTACATGC ACACAGCCGC AAAAAGGTCA AGAGGAAACA TTCCCGACGG GCCCAGGAC 600CCAAAAAGCC CCCAAAGGGC CCTTCGTACC AGCAAAGGCC CAGCGGCGCC GTCTTCACA 660GCAAAGCACC GGCACAGCGG GGGAATTNAA GANCCGAGAA CGAAGCCGGT TGCCCCCAT 720CTAAGGCTTN CCGGGGANCT TGTTCCCTTG GTTCAGCTTT GGCTGTTCCC CTGTTAGNC 780CAGCCTTGNA ACTTAAGGTG TTGCCTTAAC CGGCATTGTT TGCCCGCTTG GCTGGTTTT 840 TTG843 1643 base pairs nucleic acid single linear SYNOOAT01 724157 68GGGAGGCTGA AGCAAGAGAA CCGCTTGAAC GGGGGTGGAT GTTGCAGTGA GCCAAGATGG 60TGCCACTGCA CTCCAGCCTG GCAACAGTGC GCAGAGCGAG ACGCCGTCTC AAAACAAAA 120TCTCACAGTG GCCCAGCCTC CTTTCCTGCC 
ATCCCTGGAG TTGTGGTCTG TTTGAGGTT 180GGTTTTCAGG ACTGAAGCTT CAAGATGGCT GACCAGGACC CTGCGGGCAT CAGCCCCCT 240CAGCAAATGG TGGCCTCAGG CACCGGGGCT GTGGTTACCT CTCTCTTCAT GACACCCCT 300GACGTGGTGA AGGTTCGCCT GCAGTCTCAG CGGCCCTCCA TGGCCAGCGA GCTGATGCC 360TCCTCCAGAC TGTGGAGCCT CTCCTATACC AAATGGAAGT GCCTCCTGTA TTGCAATGG 420GTCCTGGAGC CTCTGTACCT GTGCCCAAAT GGTGCCCGCT GTGCCACCTG GTTTCAAGA 480CCTACCCGCT TCACTGGCAC CATGGATGCC TTCGTGAAGA TCGTGAGGCA CGAGGGCAC 540AGGACCCTCT GGAGCGGCCT CCCCGCCACC CTGGTGATGA CTGTGCCAGC TACCGCCAT 600TACTTCACTG CCTATGACCA ACTGAAGGCC TTCCTGTGTG GTCGAGCCCT GACCTCTGA 660CTCTACGCAC CCATGGTGGC TGGCGCGCTG GCCCGCTTGG GCACCGTGAC TGTGATCAG 720CCCCTGGAGC TTATGCGGAC AAAGCTGCAG GCTCAGCATG TGTCGTACCG GGAGCTGGG 780GCCTGTGTTC GAACTGCAGT GGCTCAGGGT GGCTGGCGCT CACTGTGGCT GGGCTGGGG 840CCCACTGCCC TTCGAGATGT GCCCTTCTCA GCCCTGTACT GGTTCAACTA TGAGCTGGT 900AAGAGCTGGC TCAATGGGCT CAGGCCGAAG GACCAGACTT CTGTGGGCAT GAGCTTTGT 960GCTGGTGGCA TCTCAGGGAC GGTGGCTGCA GTGCTGACTC TACCCTTTGA CGTGGTAA 1020ACCCAACGCC AGGTCGCTCT GGGAGCGATG GAGGCTGTGA GAGTGAACCC CCTGCATG 1080GACTCCACCT GGCTGCTGCT GCGGAGGATC CGGGCCGAGT CGGGCACCAA GGGACTCT 1140GCAGGCTTCC TTCCTCGGAT CATCAAGGCT GCCCCCTCCT GTGCCATCAT GATCAGCA 1200TATGAGTTCG GCAAAAGCTT CTTCCAGAGG CTGAACCAGG ACCGGCTTCT GGGCGGCT 1260AAGGGGCAAG GAGGCAAGGA CCCCGTCTCT CCCACGGATG GGGAGAGGGC AGGAGGAG 1320CCAGCCAAGT GCCTTTTCCT CAGCACTGAG GGAGGGGGCT TGTTTCCCTT CCCTCCCG 1380GACAAGCTCC AGGGCAGGGC TGTCCCTCTG GGCGGCCCAG CACTTCCTCA GACACAAC 1440CTTCCTGCTG CTCCAGTCGT GGGGATCATC ACTTACCCAC CCCCCAAGTT CAAGACCA 1500TCTTCCAGCT GCCCCCTTCG TGTTTCCCTG TGTTTGCTGT AGCTGGGCAT GTCTCCAG 1560ACCAAGAAGC CCTCAGCCTG GTGTAGTCTC CCTGACCCTT GTTAATTCCT TAAGTCTA 1620GATGATGAAC TTCAAAAAAA AAA 1643 2029 base pairs nucleic acid singlelinear BRAITUT03 864683 69 GCAGCTGTGA GGGAGTCGCT GTGATCCGGG GCCCCGGAACCCGAGCTGGA GCTGAAGCGC 60 AGGCTGCGGG GCGCGGAGTC GGGAGTGCAG GCCTGAGTGTTCCTTCCAGC ATGTCGGAG 120 GGGAGTCCCA GACAGTACTT AGCAGTGGCT CAGACCCAAAGGTAGAATCC TCATCTTCA 180 CTCCTGGCCT GACATCAGTG TCACCTCCTG 
TGACCTCCACAACCTCAGCT GCTTCCCCA 240 AGGAAGAAGA AGAAAGTGAA GATGAGTCTG AGATTTTGGAAGAGTCGCCC TGTGGGCGC 300 GGCAGAAGAG GCGAGAAGAG GTGAATCAAC GGAATGTACCAGGTATTGAC AGTGCATAC 360 TGGCCATGGA TACAGAGGAA GGTGTAGAGG TTGTGTGGAATGAGGTACAG TTCTCTGAA 420 GCAAGAACTA CAAGCTGCAG GAGGAAAAGG TTCGTGCTGTGTTTGATAAT CTGATTCAA 480 TGGAGCATCT TAACATTGTT AAGTTTCACA AATATTGGGCTGACATTAAA GAGAACAAG 540 CCAGGGTCAT TTTTATCACA GAATACATGT CATCTGGGAGTCTGAAGCAA TTTCTGAAG 600 AGACCAAAAA GAACCACAAG ACGATGAATG AAAAGGCATGGAAGCGTTGG TGCACACAA 660 TCCTCTCTGC CCTAAGCTAC CTGCACTCCT GTGACCCCCCCATCATCCAT GGGAACCTG 720 CCTGTGACAC CATCTTCATC CAGCACAACG GACTCATCAAGATTGGCTCT GTGGCTCCT 780 ACACTATCAA CAATCATGTG AAGACTTGTC GAGAAGAGCAGAAGAATCTA CACTTCTTT 840 CACCAGAGTA TGGAGAAGTC ACTAATGTGA CAACAGCAGTGGACATCTAC TCCTTTGGC 900 TGTGTGCACT GGAGATGGCA GTGCTGGAGA TTCAGGGCAATGGAGAGTCC TCATATGTG 960 CACAGGAAGC CATCAGCAGT GCCATCCAGC TTCTAGAAGACCCATTACAG AGGGAGTT 1020 TTCAAAAGTG CCTGCAGTCT GAGCCTGCTC GCAGACCAACAGCCAGAGAA CTTCTGTT 1080 ACCCAGCATT GTTTGAAGTG CCCTCGCTCA AACTCCTTGCGGCCCACTGC ATTGTGGG 1140 ACCAACACAT GATCCCAGAG AACGCTCTAG AGGAGATCACCAAAAACATG GATACTAG 1200 CCGTACTGGC TGAAATCCCT GCAGGACCAG GAAGAGAACCAGTTCAGACT TTGTACTC 1260 AGTCACCAGC TCTGGAATTA GATAAATTCC TTGAAGATGTCAGGAATGGG ATCTATCC 1320 TGACAGCCTT TGGGCTGCCT CGGCCCCAGC AGCCACAGCAGGAGGAGGTG ACATCACC 1380 TCGTGCCCCC CTCTGTCAAG ACTCCGACAC CTGAACCAGCTGAGGTGGAG ACTCGCAA 1440 TGGTGCTGAT GCAGTGCAAC ATTGAGTCGG TGGAGGAGGGAGTCAAACAC CACCTGAC 1500 TTCTGCTGAA GTTGGAGGAC AAACTGAACC GGCACCTGAGCTGTGACCTG ATGCCAAA 1560 AGAATATCCC CGAGTTGGCG GCTGAGCTGG TGCAGCTGGGCTTCATTAGT GAGGCTGA 1620 AGAGCCGGTT GACTTCTCTG CTAGAAGAGA CCTTGAACAAGTTCAATTTT GCCAGGAA 1680 GTACCCTCAA CTCAGCCGCT GTCACCGTCT CCTCTTAGAGCTCACTCGGG CCAGGCCC 1740 ATCTGCGCTG TGGCTGTCCC TGGACGTGCT GCAGCCCTCCTGTCCCTTCC CCCCAGTC 1800 TATTACCCTG TGAAGCCCCT TCCCTCCTTT ATTATTCAGGAGGGCTGGGG GGGCTCCC 1860 GTTCTGAGCA TCATCCTTTC CCCTCCCCTC TCTTCCTCCCCTCTGCACTT TGTTTACT 1920 TTTTGCACAG ACGTGGGCCT GGGCCTTCTC AGCAGCCGCCTTCTAGTTGG 
GGGCTAGT 1980 CTGATCTGCC GGCTCCCGCC CAGCCTGTGT GGAAAGGAGGCCCACGGGC 2029 821 base pairs nucleic acid single linear CERVNOT01933353 70 CGGGTCGCCG CTGCGCCGGG CCGGGATGGC GGCCACCGCG CTGCTGGAGGCCGGCCTGGC 60 GCGGGTGCTC TTCTACCCGA CGCTGCTCTA CACCCTGTTC CGCGGGAAGGTGCCGGGTC 120 GGCGCACCGG GACTGGTACC ACCGCATCGA CCCCACCGTG CTGCTGGGCGCGCTGCCGT 180 GCGGAGCTTG ACGCGCCAGC TGGTACAGGA CGAGAACGTG CGCGGGGTGATCACCATGA 240 CGAGGAGTAC GAGACGAGGT TCCTGTGCAA CTCTTCACAG GAGTGGAAGAGACTAGGAG 300 TGAGCAGCTG CGGCTCAGCA CAGTAGACAT GACTGGGATC CCCACCTTGGACAACCTCC 360 GAAGGGAGTC CAATTTGCTC TCAAGTACCA GTCGCTGGGC CAGTGTGTTTACGTGCATT 420 TAAGGCTGGG CGCTCCAGGA GTGCCACTAT GGTGGCAGCA TACCTGATTCAGGTGCACA 480 ATGGAGTCCA GAGGAGGCTG TAAGAGCCAT CGCCAAGATC CGGTCATACATCCACATCA 540 GCCTGGCCAG CTGGATGTTC TTAAAGAGTT CCACAAGCAG ATTACTGCACGGGCAACAA 600 GGATGGGACT TTTGTCATTT CAAAGACATG ATGTATGGGG ATTAGAAAGAACTCAAGAC 660 CTCCTGCTTG ATACAGAACA AAAAGAGCTT AACAGGACCA ACAGGGCTTAAGCCCAGAC 720 TGACGTAACA GAAATGTGCC AATAGGTAAT AGGTAATTTT TCTTTCTCTGACTTGTTTT 780 TTTTCTTGAA ATAACACTGT TGTGTGGCTA GAAAAAAAAA A 821 1139base pairs nucleic acid single linear LATRTUT02 1404643 71 CGGCGACGGTGGGGAAGATG GCGTACCAGA GCTTGCGGCT GGAGTACCTG CAGATCCCAC 60 CGGTCAGCCGCGCCTACACC ACTGCCTGCG TCCTCACCAC CGCCGCCGTG CAGTTGGAA 120 TGATCACACCTTTTCAGTTG TACTTCAATC CTGAATTAAT CTTTAAACAC TTTCAAATA 180 GGAGATTAATCACCAACTTC TTATTTTTTG GGCCAGTTGG ATTCAATTTT TTATTTAAC 240 TGATTTTTCTATATCGTTAC TGTCGAATGC TAGAAGAAGG CTCTTTCCGA GGTCGGACA 300 CAGACTTTGTATTTATGTTC CTTTTTGGTG GATTCTTAAT GACCCTTTTT GGTCTGTTT 360 TGAGCTTAGTTTTCTTGGGC CAGGCCTTTA CAATAATGCT CGTCTATGTG TGGAGCCGA 420 GGAACCCCTATGTCCGCATG AACTTCTTCG GCCTTCTCAA CTTCCAGGCC CCCTTTCTG 480 CCTGGGTGCTCATGGGATTT TCCTTGTTGT TGGGGAACTC AATCATTGTG GACCTTTTG 540 GTATTGCAGTTGGACACATA TATTTTTTCT TGGAAGATGT ATTTCCCAAT CAACCTGGT 600 GAATAAGAATTCTGAAAACA CCATCTATTT TGAAAGCTAT TTTTGATACA CCAGATGAG 660 ATCCAAATTACAATCCACTA CCTGAGGAAC GGCCAGGAGG CTTCGCCTGG GGTGAGGGC 720 AGCGGCTTGGAGGTTAAAGC AGCAGTGCCA ATAATGAGAC CCAGCTGGGA 
AGGACTCGG 780 GATACCCACTGGGATCTTTT ATCCTTTGTT GCAAAAGTGT GGACACTTTT GACAGCTTG 840 CAGATTTTAACTCCAGAAGC ACTTTATGAA ATGGTACACT GACTAATCCA GAAGACATT 900 CCAACAGTTTGCCAGTGGTT CCTCACTACA CTGGTACTGA AAGTGTAATT TCTTAGAGC 960 AAAAAACTGGAGAAACAAAT ATCCTGCCAC CTCTAACAAG TACATGAGTA CTTGATTT 1020 ATGGTATAAGGCAGAGCCTT TTCTTCCTCT TCTTGATAGA TGAGGCCATG GTGTAAAT 1080 AAGTTTCAGAGAGGACAAAA TAAAACGGAA TTCCATTTTT CTCTCACTGT AAAAAAAA 1139 1406 basepairs nucleic acid single linear SPLNNOT04 1561587 72 GGGCTGCCCCCGGGGGGCTG GCGGACTGGG CCGCGGGGGC CCCGGGGCCG GCGGTGCCGG60 GGTCATCGGGATGATGCGGA CGCAGTGTCT GCTGGGGCTG CGCACGTTCG TGGCCTTCG120 CGCCAAGCTCTGGAGCTTCT TCATTTACCT TTTGCGGAGG CAGATCCGCA CGGTAATTC180 GTACCAAACTGTTCGATATG ATATCCTCCC CTTATCTCCT GTGTCCCGGA ATCGGCTAG240 CCAGGTGAAGAGGAAGATCC TGGTGCTGGA TCTGGATGAG ACACTTATTC ACTCCCACC300 TGATGGGGTCCTGAGGCCCA CAGTCCGGCC TGGTACGCCT CCTGACTTCA TCCTCAAGG360 GGTAATAGACAAACATCCTG TCCGGTTTTT TGTACATAAG AGGCCCCATG TGGATTTCT420 CCTGGAAGTGGTGAGCCAGT GGTACGAGCT GGTGGTGTTT ACAGCAAGCA TGGAGATCT480 TGGCTCTGCTGTGGCAGATA AACTGGACAA TAGCAGAAGC ATTCTTAAGA GGAGATATT540 CAGACAGCACTGCACTTTGG AGTTGGGCAG CTACATCAAG GACCTCTCTG TGGTCCACA600 TGACCTCTCCAGCATTGTGA TCCTGGATAA CTCCCCAGGG GCTTACAGGA GCCATCCAG660 CAATGCCATCCCCATCAAAT CCTGGTTCAG TGACCCCAGC GACACAGCCC TTCTCAACC720 GCTCCCAATGCTGGATGCCC TCAGGTTCAC CGCTGATGTT CGTTCCGTGC TGAGCCGAA780 CCTTCACCAACATCGGCTCT GGTGACAGCT GCTCCCCCTC CACCTGAGTT GGGGTGGGG840 GGAAAGGGAGGGCGAGCCCT TGGGATGCCG TCTGATGCCC TGTCCAATGT GAGGACTGC900 TGGGCAGGGTCTGCCCCTCC CACCCCTCTC TGCCCTGGGA GCCCTACACT CCACTTGGA960 TCTGGATGGACACATGGGCC AGGGGCTCTG AAGCAGCCTC ACTCTTAACT TCGTGTTC1020 ACTCCATGGAAACCCCAGAC TGGGACACAG GCGGAAGCCT AGGAGAGCCG AATCAGTG1080 TGTGAAGAGGCAGGACTGGC CAGAGTGACA GACATACGGT GATCCAGGAG GCTCAAAG1140 AAGCCAAGTCAGCTTTGTTG TGATTTGATT TTTTTTAAAA AACTCTTGTA CAAAACTG1200 CTAATTCTTCACTCCTGCTC CAAGGGCTGG GCTGTGGGTG GGATACTGGG ATTTTGGG1260 ACTGGATTTTCCCTAAATTT GTCCCCCCTT TACTCTCCCT CTATTTTTCT CTCCTTAG1320 TCCCTCAGACCTGTAACCAG CTTTGTGTCT 
TTTTTCCTTT TCTCTCTTTT AAACCATG1380 TTATAACTTTGAAACCAAAA AAAAAA 1406 2028 base pairs nucleic acid single linearUTRSNOT05 1568361 73 GNANANNACC CTGCNGGNNG CCATTCACCG ACCCTGCCCANACAGCCGTC ACCCTCGANT 60 CCTGGCTGAN TCTNTTCCTG GCAGTTCCCC TTATNANGGTTACAACTATG GCTCCTTTN 120 CAATGTNTCT NTATCTACCG ATGGTCTGGT TNACAGCNCTGGCACTGGGG ACCTCTCTT 180 CGGTTACCAG GGCCGCTCCT TTGAACCTGT AGGTACTCGGCCCCGAGTGG ACTCCATGA 240 CTCTGTGGAG GAGGATGACT ACGACACATT GACCGACATCGATTCCGACA AGAATGTCA 300 TCGCACCAAG CAATACCTCT ATGTGGCTGA CCTGGCACGGAAGGACAAGC GTGTTCTGC 360 GAAAAAGTAC CAGATCTACT TCTGGAACAT TGCCACCATTGCTGTCTTCT ATGCCCTTC 420 TGTGGTGCAG CTGGTGATCA CCTACCAGAC GGTGGTGAATGTCACAGGGA ATCAGGACA 480 CTGCTACTAC AACTTCCTCT GCGCCCACCC ACTGGGCAATCTCAGCGCCT TCAACAACA 540 CCTCAGCAAC CTGGGGTACA TCCTGCTGGG GCTGCTTTTCCTGCTCATCA TCCTGCAAC 600 GGAGATCAAC CACAACCGGG CCCTGCTGCG CAATGACCTCTGTGCCCTGG AATGTGGGA 660 CCCCAAACAC TTTGGGCTTT TCTACGCCAT GGGCACAGCCCTGATGATGG AGGGGCTGC 720 CAGTGCTTGC TATCATGTGT GCCCCAACTA TACCAATTTCCAGTTTGACA CATCGTTCA 780 GTACATGATC GCCGGACTCT GCATGCTGAA GCTCTACCAGAAGCGGCACC CGGACATCA 840 CGCCAGCGCC TACAGTGCCT ACGCCTGCCT GGCCATTGTCATCTTCTTNT CTGTGCTGG 900 CGTGGTCTTT GGCAAAGGGA ACACGGCGTT CTGGATCGTCTTCTCCATCA TTCACATCA 960 CGCCACCCTG CTCCTCAGCA CGCAGCTCTA TTACATGGGCCGGTGGAAAC TGGACTCG 1020 GATCTTCCGC CGCATCCTCC ACGTGCTCTA CACAGACTGCATCCGGCAGT GCAGCGGG 1080 GCTCTACGTG GACCGCATGG TGCTGCTGGT CATGGGCAACGTCATCAACT GGTCGCTG 1140 TGCCTATGGG CTTATCATGC GCCCCAATGA TTTCGCTTCCTACTTGTTGG CCATTGGC 1200 CTGCAACCTG CTCCTTTACT TCGCCTTCTA CATCATCATGAAGCTCCGGA GTGGGGAG 1260 GATCAAGCTC ATCCCCCTGC TCTGCATCGT TTGCACCTCCGTGGTCTGGG GCTTCGCG 1320 CTTCTTCTTC TTCCAGGGAC TCAGCACCTG GCAGAAAACCCCTGCAGAGT CGAGGGAG 1380 CAACCGGGAC TGCATCCTCC TCGACTTCTT TGACGACCACGACATCTGGC ACTTCCTC 1440 CTCCATCGCC ATGTTCGGGT CCTTCCTGGT GTTGCTGACACTGGATGACG ACCTGGAT 1500 TGTGCAGCGG GACAAGATCT ATGTCTTCTA GCAGGAGCTGGGCCCTTCGC TTCACCTC 1560 GGGGCCCTGA GCTCCTTTGT GTCATAGACC GGTCACTCTGTCGTGCTGTG GGGATGAG 1620 CCAGCACCGC TGCCCAGCAC TGGATGGCAG 
CAGGACAGCCAGGTCTAGCT TAGGCTTG 1680 CTGGGACAGC CATGGGGTGG CATGGAACCT TGCAGCTGCCCTCTGCCGAG GAGCAGGC 1740 GCTCCCCTGG AACCCCCAGA TGTTGGCCAA ATTGCTGCTTTCTTCTCAGT GTTGGGGC 1800 TCCATGGGCC CCTGTCCTTT GGCTCTCCAT TTGTCCCTTTGCAAGAGGAA GGATGGAA 1860 GACACCCTCC CCATTTCATG CCTTGCATTT TGCCCGTCCTCCTCCCCACA ATGCCCCA 1920 CTGGGACCTA AGGCCTCTTT TTCCTCCCAT ACTCCCACTCCAGGGCCTAG TCTGGGGC 1980 GAATCTCTGT CCTGTATCAG GGCCCCAGTT CTCTTTGGGCTGTCCCTG 2028 1380 base pairs nucleic acid single linear LNODNOT031572888 74 CTCGAGCCGA ATTCGGCTCG AGCGGCGCTC CTGCCTCCCT GCAGGGAGCTGCTTATGGGA 60 CACCGCTTCC TGCGCGGCCT CTTAACGCTG CTGCTGCCGC CGCCACCCCTGTATACCCG 120 CACCGCATGC TCGGTCCAGA GTCCGTCCCG CCCCCAAAAC GATCCCGCAGCAAACTCAT 180 GCACCGCCCC GAATCGGGAC GCACAATGGC ACCTTCCACT GCGACGAGGCACTGGCATG 240 GCACTGCTTC GCCTCCTGCC GGAGTACCGG GATGCAGAGA TTGTGCGGACCCGGGATCC 300 GAAAAACTCG CTTCCTGTGA CATCGTGGTG GACGTGGGGG GCGAGTACGACCCTCGGAG 360 CACCGATATG ACCATCACCA GAGGTCTTTC ACAGAGACCA TGAGCTCCCTGTCCCCTGG 420 AAGCCGTGGC AGACCAAGCT GAGCAGTGCG GGACTCATCT ATCTGCACTTCGGGCACAA 480 CTGCTGGCCC AGTTGCTGGG CACTAGTGAA GAGGACAGCA TGGTGGGCACCCTCTATGA 540 AAGATGTATG AGAACTTTGT GGAGGAGGTG GATGCTGTGG ACAATGGGATCTCCCAGTG 600 GCAGAGGGGG AGCCTCGATA TGCACTGACC ACTACCCTGA GTGCACGAGTTGCTCGACT 660 AATCCTACCT GGAACCACCC CGACCAAGAC ACTGAGGCAG GGTTCAAGCGTGCAATGGA 720 CTGGTTCAAG AGGAGTTTCT GCAGAGATTA GATTTCTACC AACACAGCTGGCTGCCAGC 780 CGGGCCTTGG TGGAAGAGGC CCTTGCCCAG CGATTCCAGG TGGACCCAAGTGGAGAGAT 840 GTGGAACTGG CGAAAGGTGC ATGTCCCTGG AAGGAGCATC TCTACCACCTGGAATCTGG 900 CTGTCCCCTC CAGTGGCCAT CTTCTTTGTT ATCTACACTG ACCAGGCTGGACAGTGGCG 960 ATACAGTGTG TGCCCAAGGA GCCCCACTCA TTCCAAAGCC GGCTGCCCCTGCCAGAGC 1020 TGGCGGGGTC TTCGGGACGA GGCCCTGGAC CAGGTCAGTG GGATCCCTGGCTGCATCT 1080 GTCCATGCAA GCGGCTTCAT TGGCGGTCAC CGCACCCGAG AGGGTGCCTTGAGCATGG 1140 CGTGCCACCT TGGCCCAGCG CTCATACCTC CCACAAATCT CCTAGTCTAATAAAACCT 1200 CATCTCATAC TGACAAAAAA AAAATGGTAG CTCGGGGGGC GTTAGGATAGATCTTTAG 1260 CCGGTGGGGT TTCTAAGCTG GAGAAGTGTT AGAAAAAAAG GGCGGCCGCCCATCAAAT 1320 
GTTCTCTGCC CCGGGATTTT TTTTCGGGCC GGTACCCTTG GGGGGACCAAATTTCCCC 1380 2028 base pairs nucleic acid single linear LNODNOT031573677 75 CAAAAGGACA AGATAATAAA GTACAAAATG GTTCGTTACA TCAGAAGGATACAGTTCATG 60 ACAATGACTT TGAGCCCTAC CTTACTGGAC AGTCAAATCA GAGTAACAGTTACCCCTCA 120 TGAGCGACCC CTACCTGTCC AGCTATTACC CGCCGTCCAT TGGATTTCCTTACTCCCTC 180 ATGAGGCTCC GTGGTCTACT GCAGGGGACC CTCCGATTCC ATACCTCACCACCTACGGA 240 AGCTCAGTAA CGGAGACCAT CATTTTATGC ACGATGCTGT TTTTGGGCAGCCTGGGGGC 300 TGGGGAACAA CATCTATCAG CACAGGTTCA ATTTTTTCCC TGAAAACCCTGCGTTCTCA 360 CATGGGGGAC AAGTGGGTCT CAAGGTCAGC AGACCCAGAG CTCAGCCTCTCCCAGCACA 420 CCCCCAGCTT TGGCTCAACC GCAGTATCAG AGCCCTCAGC AGCCACCCCAGACCCGCTG 480 GTTGCCCCAC GCAACAGAAA CGCGGCGTTT GGGCAGAGCG GAGGGGCTGGCAGCGATAG 540 AACTCTCCTG GAAACGTCCA GCCTAATTCT GCCCCCAGCG TCGAATCCCACCCCGTCCT 600 GAAAAACTGA AGGCTGCTCA CAGCTACAAC CCGAAAGAGT TTGAGTGGAATCTGAAAAG 660 GGGCGTGTGT TCATCATCAA GAGCTACTCT GAGGACGACA TCCACCGCTCCATTAAGTA 720 TCCATCTGGT GTAGCACAGA GCACGGCAAC AAGCGCCTGG ACAGCGCCTTCCGCTGCAT 780 AGCAGCAAGG GGCCCGTCTA CCTGCTCTTC AGCGTCAATG GGAGTGGGCATTTTTGTGG 840 GTGGCCGAGA TGAAGTCCCC CGTGGACTAC GGCACCAGTG CCGGGGTCTGGTCTCAGGA 900 AAGTGGAAGG GGAAGTTTGA TGTCCAGTGG ATTTTTGTTA AGGATGTACCCAATAACCA 960 CTCCGGCACA TCAGGCTGGA GAATAACGAC AACAAACCGG TCACAAACTCCCGGGACA 1020 CAGGAGGTGC CCTTAGAAAA AGCCAAGCAA GTGCTGAAAA TTATCAGTTCCTACAAGC 1080 ACAACCTCCA TCTTCGACGA CTTTGCTCAC TACGAGAAGC GCCAGAGGAGGAGGAGGT 1140 TGCGCAAGGA ACGGCAGAGT CGAAACAAAC AATGAGGGCG AACCAGTTTCTTACATGT 1200 TAACGTTTGA CTTTGAAAAC AGTTTAAAAC ACGTGTGCTT GGTCAGCTCCAGTGTGTC 1260 CCCGTGCGGG GGTTGAGTGT TGCATCTTTG CCTTTCTTGT CGTTGATTTTTGCCCAGA 1320 GATCTGCATT TATTTGTACT TTTTCTATGT ATTATAATCC TGTAGAAGTCACTAATAA 1380 GAGTATTTTT TTTTGTCAGC TTATCAATCA GACTGATCTA ATGTGAAATGTAAGTATC 1440 TAAAAACAAA GCATCTATTT TGGCAGAAAT TGTGTTCTTA AATTCAGTCATTTGATAT 1500 TGTGAGACTT CATATTTCTC ATCCCTTTAT TGCTTTTTAG CAAACATAAGAAACCATG 1560 TCATTTTGTC ATTTAGAGTA TTCTGATAAA ATCTCTTGAA AATACTGAAATCAAAAGG 1620 AATGATTTTT TGTTCATTCT GATTTGTCAT 
TTTATTATCT GTTATCGGTCTAAAGTGC 1680 ATTTACCCAT TTGATTTTTC TGCTAGACAG ATAACTTTTA ATTTTTCAAATTTGGCAG 1740 ACTTTTTTTT TTTTTTTGAA AATCTTTCCT TCCAGATCTG TTGCCCACTGAACAGCCA 1800 CGTCCCTCAC TGTCCTGGTG TCCGATTGGG CTGGATGGTG TTGGGGCATGATGTGTGG 1860 GAACTGGAAG GTGCTTTAGG TCTGGTTCAG GGTCGGGCAT TCTTTGTTGTTTGCACAT 1920 TTTTAAATTT TACACCTTTT CTTAAGAATT CTAATGCCGT CTTAAGTTTTTATACCAA 1980 ATGCTGAGCT TTAAGTGTAG GATCTGGTAG TACAGACAGT GTGATGGA 20281170 base pairs nucleic acid single linear LNODNOT03 1574624 76TCCGCACTAG GCCCGCCCCA ATTTGCGTGT TTTTACCGTG CAGAGGGAGG GATTTAGAGT 60TAGCCTTTGA TTGGTCAGCT TGACTGGCGA CCTTTCCCCT CTGCGACAGT TTCCCGAGG 120ACCTAGTGTC TGAGCGGCAC AGACGAGATC TCGATCGAAG GCGAGATGGC GGACGTGCT 180GATCTTCACG AGGCTGGGGG CGAAGATTTC GCCATGGATG AGGATGGGGA CGAGAGCAT 240CACAAACTGA AAGAAAAAGC GAAGAAACGG AAGGGTCGCG GCTTTGGCTC CGAAGAGGG 300TCCCGAGCGC GGATGCGTGA GGATTATGAC AGCGTGGAGC AGGATGGCGA TGAACCCGG 360CCACAACGCT CTGTTGAAGG CTGGATTCTC TTTGTAACTG GAGTCCATGA GGAAGCCAC 420GAAGAAGACA TACACGACAA ATTCGCAGAA TATGGGGAAA TTAAAAACAT TCATCTCAA 480CTCGACAGGC GAACAGGATA TCTGAAGGGG TATACTCTAG TTGAATATGA AACATACAA 540GAAGCCCAGG CTGCTATGGA GGGACTCAAT GGCCAGGATT TGATGGGACA GCCCATCAG 600GTTGACTGGT GTTTTGTTCG GGGTCCACCA AAAGGCAAGA GGAGAGGTGG CCGAAGACG 660AGCAGAAGTC CAGACCGGAG ACGTCGCTGA CAGGTCCTCT GTTGTCCAGG TGTTCTCTT 720AAGATTCCAT TTGACCATGC AGCCTTGGAC AAATAGGACT GGGGTGGAAC TTGCTGTGT 780TATATTTAAT CTCTTACCGT ATATGCGTAG TATTTGAGTT GCGAATAAAT GTTCCATTT 840TGTTTTCTAC ATTTAATGTT ACTTTCCTGT TCCCAAAATT GAAAGTTCTA AAGCATAGC 900AGGCTGTATG GATCATTTGG AAAGATACCT TCTAGGGACT GAACCCCAAG TANTTCCTT 960TNTTCCCTTT TCCGAAAATA ATCCCTGCTG TGTTTACCCG GGTGTGATTG CCCCTGAT 1020TATCCCACTG CCGTCTCCAA ACTTTTGGGC TTCATCCTTC TTTCTGCCTC CTTGTCTC 1080ATTTTTTGGG ATAATAGGCC CTTCTCCCCT CCCCATCAAA TATTTTATTA TTTTTTTC 1140AAAAGGTTTT TCCGTTTTTC ACGGGGCCCT 1170 1107 base pairs nucleic acidsingle linear LNODNOT03 1577239 77 CCGAGCGCGG CCCCTGGGTT CGAACACGGCACCCGCACTG CGCGTCATGG TGCAGGCCTG 60 GTATATGGAC GACGCCCCGG GCGACCCGCGGCAACCCCAC 
CGCCCCGACC CCGGCCGCC 120 AGTGGGCCTG GAGCAGCTGC GGCGGCTCGGGGTGCTCTAC TGGAAGCTGG ATGCTGACA 180 ATATGAGAAT GATCCAGAAT TAGAAAAGATCCGAAGAGAG AGGAACTACT CCTGGATGG 240 CATCATAACC ATATGCAAAG ATAAACTACCAAATTATGAA GAAAAGATTA AGATGTTCT 300 CGAGGAGCAT TTGCACTTGG ACGATGAGATCCGCTACATC CTGGATGGCA GTGGGTACT 360 CGACGTGAGG GACAAGGAGG ACCAGTGGATCCGGATCTTC ATGGAGAAGG GAGACATGG 420 GACGCTCCCC GCGGGGATCT ATCACCGCTTCACGGTGGAC GAGAAGAACT ACACGAAGG 480 CATGCGGCTG TTTGTGGGAG AACCGGTGTGGACAGCGTAC AACCGGCCCG CTGACCATT 540 TGAAGCCCGC GGGCAGTACG TGAAATTTCTGGCACAGACC GCCTAGCAGT GCTGCCTGG 600 AACTAACACG CGCCTCGTAA AGGTCCCCAATGTAATGACT GAGCAGAAAA TCAATCACT 660 TCTCTTTGCT TTTAGAGGAT AGCCTTGAGGCTAGATTATC TTTCCTTTGT AAGATTATT 720 GATCAGAATA TTTTGTAATG AAAGGATCTAGAAAGCAACT TGGAAGTGTA AAGAGTCAC 780 TTCATTTTCT GTAACTCAAT CAAGACTGGTGGGTCCATGG CCCTGTGTTA GTTCATGCA 840 TCAGTTGAGT CCCAAATGAA AGTTTCATCTCCCGAAATGC AGTTCCTTAG ATGCCCATC 900 GGACGTGATG CCGCGCCTGC CGTGTAAGAAGGTGCAATCC TAGATAACAC AGCTAGCCA 960 ATAGAAGACA CTTTTTTCTC CAAAATGATGCCTTGGGGTG GGGAGTGGTA GGGGGAAG 1020 CTCCCACCCT AAGGGGCACA CACTGAGTTGCTTATGCCAC TTCCTTGTTC AAAATAAA 1080 AACTGCCTTA ATCTTATACT CATGGCT 11071075 base pairs nucleic acid single linear BLADNOT03 1598203 78CGGTAACCAG CCCTGGGAAG CCCGCAAGAG GCCTCAGCGG TGGCCGTCCG AGAGCCGAGA 60GGTGAGGGTG CCCCCGCCTC ACCTGCAGAG GGGCCGTTCC GGGCTCGAAC CCGGCACCT 120CCGGAAAATG GCGGCTGCCA GGCCCAGCCT GGGCCGAGTC CTCCCAGGAT CCTCTGTCC 180GTTCCTGTGT GACATGCAGG AGAAGTTCCG CCACAACATC GCCTACTTCC CACAGATCG 240CTCAGTGGCT GCCCGCATGC TCAAGGTGGC CCGGCTGCTT GAGGTGCCAG TCATGCTGA 300GGAGCAGTAC CCACAAGGCC TGGGCCCCAC GGTGCCCGAG CTGGGGACTG AGGGCCTTC 360GCCGCTGGCC AAGACCTGCT TCAGCATGGT GCCTGCCCTG CAGCAGGAGC TGGACAGTC 420GCCCCAGCTG CGCTCTGTGC TGCTCTGTGG CATTGAGGCA CAGGCCTGCA TCTTGAACA 480GACCCTGGAC CTCCTAGACC GGGGGCTGCA GGTCCATGTG GTGGTGGACG CCTGCTCCT 540ACGCAGCCAG GTGGACCGGC TGGTGGCTCT GGCCCGCATG AGACAGAGTG GTGCCTTCC 600CTCCACCAGC GAAGGGCTCA TTCTGCAGCT TGTGGGCGAT GCCGTCCACC CCCAGTTCA 660GGAGATCCAG AAACTCATCA AGGAGCCCGC 
CCCAGACAGC GGACTGCTGG GCCTCTTCC 720AGGCCAGAAC TCCCTCCTCC ACTGAACTCC AACCCTGCCT TGAGGGAAGA CCACCCTCC 780GTCACCCGGA CCTCAGTGGA AGCCCGTTCC CCCCATCCCT GGATCCCAAG AGTGGTGCG 840TCCACCAGGA GTGCCGCCCC CTTGTGGGGG GGGGCAGGGT GCTGCCTTCC CATTGGACA 900CTGCTCCCGG AAATGCAAAT GAGACTCCTG GAAACTGGGT GGGAATTGGC TGAGCCAAG 960TGGAGGCGGG GCTCGGCCCC GGGCCACTTC ACGGGGCGGG AAGGGGAGGG GAAGAAGA 1020CTCAGACTGT GGGACACGGA CTCGCAGAAT AAACATATAT GTGGCAAAAA AAAAA 1075 1830base pairs nucleic acid single linear BLADNOT03 1600438 79 GTCAATGTGTCTGTCCTTCA CTCCTCCATT GTCTGCCGCC ACTGCTGCTG CTGCTGCTGC 60 TGCCGCTGCTGCTGCACGAA TCGTCGCAGC CCCCAGCCTT GCGCGTCGTC GCTACCTCC 120 CGGACAGAAATTTTATGAAT AAGCATCAGA AGCCAGTGCT AACAGGCCAG CGGTTCAAA 180 CTCGGAAAAGGGATGAAAAA GAGAAATTCG AACCCACAGT CTTCAGGGAT ACACTTGTC 240 AGGGGCTTAATGAGGCTGGT GATGACCTTG AAGCTGTAGC CAAATTTCTG GACTCTACA 300 GCTCAAGATTAGATTATCGT CGCTATGCAG ACACACTCTT CGATATCCTG GTGGCTGGC 360 GTATGCTTGCCCCTGGAGGA ACGCGCATAG ATGATGGTGA CAAGACCAAG ATGACCAAC 420 ACTGTGTGTTTTCAGCAAAT GAAGATCATG AAACCATCCG AAACTATGCT CAGGTCTTC 480 ATAAACTCATCAGGAGATAT AAGTATTTGG AGAAGGCATT TGAAGATGAA ATGAAAAAG 540 TTCTCCTCTTCCTTAAAGCC TTTTCCGAAA CAGAGCAGAC AAAGTTGGCG ATGCTGTCG 600 GGATTCTGCTGGGCAATGGC ACCCTGCCCG CCACCATCCT CACCAGTCTC TTCACCGAC 660 GCTTAGTCAAAGAAGGCATT GCGGCCTCAT TTGCTGTCAA GCTTTTCAAA GCATGGATG 720 CAGAAAAAGATGCCAACTCT GTTACCTCGT CTTTGAGAAA AGCCAACTTA GACAAGAGG 780 TGCTTGAACTCTTTCCAGTT AACAGACAGA GTGTGGATCA TTTTGCTAAA TACTTCACT 840 ACGCAGGTCTTAAGGAGCTT TCCGACTTCC TCCGAGTCCA GCAGTCCCTG GGCACCAGG 900 AGGAACTGCAGAAGGAGCTC CAGGAGCGTC TTTCTCAGGA ATGCCCGATC AAGGAGGTG 960 TGCTTTATGTCAAAGAAGAA ATGAAGAGGA ATGATCTTCC AGAAACAGCA GTGATTGG 1020 TTCTGTGGACATGTATAATG AACGCTGTTG AGTGGAACAA GAAGGAAGAA CTTGTTGC 1080 AGCAGGCTCTGAAGCACCTG AAGCAATATG CTCCCCTGCT GGCCGTGTTC AGCTCCCA 1140 GCCAGTCAGAGCTGATCCTC CTCCAGAAGG TTCAGGAATA CTGCTACGAC AACATCCA 1200 TCATGAAAGCCTTTCAGAAG ATTGTGGTTC TCTTTTATAA AGCTGATGTT CTGAGCGA 1260 AAGCAATACTGAAATGGTAT AAGGAAGCAC ATGTTGCTAA AGGCAAAAGT GTTTTTCT 1320 
ACCAGATGAAGAAATTTGTT GAGTGGTTAC AAAATGCAGA AGAAGAATCC GAATCGGA 1380 GTGAGGAAAATTAAATGGCT CAACAAGCAC AATACCTAGG TTACCACACA CCACTTTT 1440 ATTGGGAATGCTGAACCATT TGAGAAGAGA AACTTGGCTT CTGTTTTCGC AAAGGAAA 1500 AAAAATAGGATAGGCTTCCC TTGTGCAGAG GGAGAAATGG TTTTGTTTTT GTTTTGTT 1560 TAAATGGAGCCCTGAGGCAT CAGCTATTAT ACTTGGGACT CTACCTCTCA CTCACTAT 1620 GCTAACTTAAAGCCATTCAA CAAGGAGTCA AGTAGATCTG AAATTAAATA CTCAACAG 1680 TCCTCCTTTTTTAGCTGTAT TTTTCAGGTA CTGTGTGGTG ACCGCCCCAC TGGTGTCT 1740 TACAGGCCACTTTGGTAGTT GTGTATCTGC TCATGTATGT GATTTGACAA ACCAGTTT 1800 TAAAATAAATGGCTTTTTAA GAAAAAAAAA 1830 1330 base pairs nucleic acid single linearBLADNOT03 1600518 80 CCGGCAGCCA TCCCCGCGGT GCTGACATCC CGGTTGTTCTTCTGTGCCGG GGGTCTTCCT 60 GCTGTCATGA AGGACGTACC GGGCTTCCTA CAGCAGAGCCAGAGCTCCGG GCCCGGGCA 120 CCCGCTGTGT GGCACCGTCT GGAGGAGCTC TACACGAAGAAGTTGTGGCA TCAGCTGAC 180 CTTCAGGTGC TTGATTTTGT GCAGGATCCG TGCTTTGCCCAAGGAGATGG TCTCATTAA 240 CTTTATGAAA ACTTTATCAG TGAATTTGAA CACAGGGTGAATCCTTTGTC CCTCGTGGA 300 ATCATTCTTC ATGTAGTTAG ACAGATGACT GATCCTAATGTGGCTCTTAC TTTTCTGGA 360 AAGACTCGTG AGAAGGTGAA AAGTAGTGAT GAGGCAGTGATCCTGTGTAA AACAGCAAT 420 GGAGCTCTAA AATTAAACAT CGGGGACCTA CAGGTTACAAAGGAAACAAT TGAAGATGT 480 GAAGAAATGC TCAACAACCT TCCTGGTGTG ACATCGGTTCACAGTCGTTT CTATGATCT 540 TCCAGTAAAT ACTATCAAAC AATCGGAAAC CACGCGTCCTACTACAAAGA TGCTCTGCG 600 TTTTTGGGCT GTGTTGACAT CAAGGATCTA CCAGTGTCTGAGCAGCAGGA GAGAGCCTT 660 ACGCTGGGGC TAGCAGGACT TCTCGGCGAG GGAGTTTTTAACTTTGGAGA ACTCCTCAT 720 CACCCTGTGC TGGAGTCCCT GAGGAATACT GACCGGCAGTGGCTGATTGA CACCCTCTA 780 GCCTTCAACA GTGGCAACGT AGAGCGGTTC CAGACTCTGAAGACTGCCTG GGGCCAGCA 840 CCTGATTTAG CAGCTAATGA AGCCCAGCTT CTGAGGAAAATTCAGTTGTT GTGCCTCAT 900 GAGATGACTT TCACACGACC TGCCAATCAC AGACAACTCACTTTTGAAGA AATTGCCAA 960 AGTGCTAAAA TCACAGTGAA TGAGGTGGAG CTTCTGGTGATGAAGGCCCT TTCGGTGG 1020 CTGGTGAAAG GCAGTATAGA CGAGGTGGAC AAACGAGTCCACATGACCTG GGTGCAGC 1080 CGAGTGTTGG ATTTGCAACA GATCAAGGGA ATGAAGGACCGCCTGGAGTT CTGGTGCA 1140 GATGTGAAGA GCATGGAGAT GCTGGTGGAG CACCAGGCCCATGACATCCT CACCTAGG 
1200 CCCCTGGTTC CCCGTCGTGT CTCCTTTGAC TCACCTGAGAGAGGCGTTTG CAGCCAAT 1260 AGCTGGCTGC TCAGACGGTC GACATTGAAT TTGGGTGGGGGTTGGGATCC TGTCTGAA 1320 ACAGAATGTT 1330 1152 base pairs nucleic acidsingle linear BLADNOT03 1602473 81 CGAGCCGGCG CACCGTACGC TGGGACGTGTGGTTTCAGCT CGTGCGCCTC CCCGTGGGTT 60 TGCGACGTTT AGCGACTATT GCGCCTGCGCCAGCGCCGGC TGCGAGACTG GGGCCGTGG 120 TGCTGGTCCC GGGTGATGCT AGGCGGCTCCCTGGGCTCCA GGCTGTTGCG GGGTGTAGG 180 GGGAGTCACG GACGGTTCGG GGCCCGAGGTGTCCGCGAAG GTGGCGCAGC CATGGCGGC 240 GGGGAGAGCA TGGCTCAGCG GATGGTCTGGGTGGACCTGG AGATGACAGG ATTGGACAT 300 GAGAAGGACC AGATTATTGA GATGGCCTGTCTGATAACTG ACTCTGATCT CAACATTTT 360 GCTGAAGGTC CTAACCTGAT TATAAAACAACCAGATGAGT TGCTGGACAG CATGTCAGA 420 TGGTGTAAGG AGCATCACGG GAAGTCTGGCCTTACCAAGG CAGTGAAGGA GAGTACAAT 480 ACATTGCAGC AGGCAGAGTA TGAATTTCTGTCCTTTGTAC GACAGCAGAC TCCTCCAGG 540 CTCTGTCCAC TTGCAGGAAA TTCAGTTCATGAAGATAAGA AGTTTCTTGA CAAATACAT 600 CCCCAGTTCA TGAAACATCT TCATTATAGAATAATTGATG TGAGCACTGT TAAAGAACT 660 TGCAGACGCT GGTATCCAGA AGAATATGAATTTGCACCAA AGAAGGCTGC TTCTCATAG 720 GCACTTGATG ACATTAGTGA AAGCATCAAAGAGCTTCAGT TTTACCGAAA TAACATCTT 780 AAGAAAAAAA TAGATGAAAA GAAGAGGAAAATTATAGAAA ATGGGGAAAA TGAGAAGAC 840 GTGAGTTGAT GCCAGTTATC ATGCTGCCACTACATCGTTA TCTGGAGGCA ACTTCTGGT 900 GTTTTTTTTT CTCACGCTGA TGGCTTGGCAGAGCACCTTC GGTTAACTTG CATCTCCAG 960 TTGATTACTC AAGCAGACAG CACACGAAATACTATTTTTC TCCTAATATG CTGTTTCC 1020 TATGACACAG CAGCTCCTTT GTAAGTACCAGGTCATGTCC ATCCCTTGGT ACATATAT 1080 ATTTGCTTTT AAACCATTTC TTTTGTTTAAATAAATAAAT AAGTAAATAA AGCTAGTT 1140 ATTGAAATGC AA 1152 566 base pairsnucleic acid single linear LUNGNOT15 1605720 82 GTTCTCCGCC CCTGCCACTGGGCCATGGAG ACTGTGGCAC AGTAGACTGT AGTGTGAGGC 60 TCGCGGGGGC AGTGGCCATGGAGGCCGTGC TGAACGAGCT GGTGTCTGTG GAGGACCTG 120 TGAAGTTTGA AAAGAAATTTCAGTCTGAGA AGGCAGCAGG CTCGGTGTCC AAGAGCACG 180 AGTTTGAGTA CGCCTGGTGCCTGGTGCGGA GCAAGTACAA TGATGACATC CGTAAAGGC 240 TCGTGCTGCT CGAGGAGCTGCTGCCCAAAG GGAGCAAGGA GGAACAGCGG GATTACGTC 300 TCTACCTGGC CGTGGGGAACTACCGGCTCA AGGAATACGA GAAGGCCTTA AAGTACGTC 
360 GCGGGTTGCT GCAGACAGAGCCCCAGAACA ACCAGGCCAA GGAACTGGAG CGGCTCATT 420 ACAAGGCCAT GAAGAAAGATGGACTCGTGG GCATGGCCAT CGTGGGAGGC ATGGCCCTG 480 GTGTGGCGGG ACTTGCCGGACTCATCGGAC TTGCTGTGTC CAAGTCCAAA TTCTGAAGG 540 GACGCGGGAG CCCACGGAGAACGCTC 566 745 base pairs nucleic acid single linear COLNTUT06 161050183 CGGAAGAGGT AGCTCACGCG ATAGAAACGT GTTCGCTGCC CAGAAGAAGG GAAGGCGCGA 60GTGAGGAAAG GAGGTACTGT AGATGCCCTC CAAATCCTTG GTTATGGAAT ATTTGGCTC 120TCCCAGTACA CTCGGCTTGG CTGTTGGAGT TGCTTGTGGC ATGTGCCTGG GCTGGAGCC 180TCGAGTATGC TTTGGGATGC TCCCCAAAAG CAAGACGAGC AAGACACACA CAGATACTG 240AAGTGAAGCA AGCATCTTGG GAGACAGCGG GGAGTACAAG ATGATTCTTG TGGTTCGAA 300TGACTTAAAG ATGGGAAAAG GGAAAGTGGC TGCCCAGTGC TCTCATGCTG CTGTTTCAG 360CTACAAGCAG ATTCAAAGAA GAAATCCTGA AATGCTCAAA CAATGGGAAT ACTGTGGCC 420GCCCAAGGTG GTGGTCAAAG CTCCTGATGA AGAAACCCTG ATTGCATTAT TGGCCCATG 480AAAAATGCTG GGACTGACTG TAAGTTTAAT TCAAGATGCT GGACGTACTC AGATTGCAC 540AGGCTCTCAA ACTGTCCTAG GGATTGGGCC AGGACCAGCA GACCTAATTG ACAAAGTCA 600TGGTCACCTA AAACTTTACT AGGTGGACTT TGATATGACA ACAACCCCTC CATCACAAG 660GTTTGAAGCC TGTCAGATTC TAACAACAAA AGCTGAATTT CTTCACCCAA CTTAAATGT 720CTTGAGATGA AAATAAAACC TATTA 745 909 base pairs nucleic acid singlelinear BLADNOT06 1720770 84 TGAGGGAGAC CGCGGCTCGG CCGTAGCGGA GCTGCGAGGTGGCAGGGCCC AGCCCCGAAC 60 CAGACAAGGG ACCCCTCAAG GAGCTTCATT CTAGCAGGAGAAAATTGAGA AGTAAACCA 120 AAAGTTACAG AATGTCTGAA GGGGACAGTG TGGGAGAATCCGTCCATGGG AAACCTTCG 180 TGGTGTACAG ATTTTTCACA AGACTTGGAC AGATTTATCAGTCCTGGCTA GACAAGTCC 240 CACCCTACAC GGCTGTGCGA TGGGTCGTGA CACTGGGCCTGAGCTTTGTC TACATGATT 300 GAGTTTACCT GCTGCAGGGT TGGTACATTG TGACCTATGCCTTGGGGATC TACCATCTA 360 ATCTTTTCAT AGCTTTTCTT TCTCCCAAAG TGGATCCTTCCTTAATGGAA GACTCAGAT 420 ACGGTCCTTC GCTACCCACC AAACAGAACG AGGAATTCCGCCCCTTCATT CGAAGGCTC 480 CAGAGTTTAA ATTTTGGCAT GCGGCTACCA AGGGCATCCTTGTGGCTATG GTCTGTACT 540 TCTTCGACGC TTTCAACGTC CCGGTGTTCT GGCCGATTCTGGTGATGTAC TTCATCATG 600 TCTTCTGTAT CACGATGAAG AGGCAAATCA AGCACATGATTAAGTACCGG TACATCCCG 660 TCACACATGG GAAGAGAAGG TACAGAGGCA 
AGGAGGATGCCGGCAAGGCC TTCGCCAGC 720 AGAAGCGGGA CTGAGGCTGC CTCACGTGTT GCAAGAACAGTTTTGAGCCA TTGTTAACA 780 TGCCTTTTTT CTTCACATAA AGTAGTTGAT TACGAGGGAGTCAAATTTTC TTTTTAAAA 840 GGAGCTTCAA TGATTTGTAA CTGAAATATC AGGTTCTAGAAGAAACTGGC GCTTAAACC 900 AAAAAAAAA 909 2028 base pairs nucleic acidsingle linear BRAINON01 1832295 85 CCGGCGGGCG GGGGCTGAGG GGCTGCCATGGCGGCGGCGG GCCGGCTCCC GAGCTCCTGG 60 GCCCTCTTCT CGCCGCTCCT CGCAGGGCTTGCACTACTGG GAGTCGGGCC GGTCCCAGC 120 CGGGCGCTGC ACAACGTCAC GGCCGAGCTCTTTGGGGCCG AGGCCTGGGG CACCCTTGC 180 GCTTTCGGGG ACCTCAACTC CGACAAGCAGACGGATCTCT TCGTGCTGCG GGAAAGAAA 240 GACTTAATCG TCTTTTTGGC AGACCAGAATGCACCCTATT TTAAACCCAA AGTAAAGGT 300 TCTTTCAAGA ATCACAGTGC ATTGATAACAAGTGTAGTCC CTGGGGATTA TGATGGAGA 360 TCTCAAATGG ATGTCCTTCT GACATATCTTCCCAAAAATT ATGCCAAGAG TGAATTAGG 420 GCTGTTATCT TCTGGGGACA AAATCAAACATTAGATCCTA ACAATATGAC CATACTCAA 480 AGGACTTTTC AAGATGAGCC ACTAATTATGGATTTCAATG GTGATCTAAT TCCTGATAT 540 TTTGGTATCA CAAATGAATC CAACCAGCCACAGATACTAT TAGGAGGGAA TTTATCATG 600 CATCCAGCAT TGACCACTAC AAGTAAAATGCGAATTCCAC ATTCTCATGC ATTTATTGA 660 CTGACTGAAG ATTTTACAGC AGATTTATTCCTGACGACAT TGAATGCCAC CACTAGTAC 720 TTCCAGTTTG AAATATGGGA AAATTTGGATGGAAACTTCT CTGTCAGTAC TATATTGGA 780 AAACCTCAAA ATATGATGGT GGTTGGACAGTCAGCATTTG CAGACTTTGA TGGAGATGG 840 CACATGGATC ATTTACTGCC AGGCTGTGAAGATAAAAATT GCCAAAAGAG TACCATCTA 900 TTAGTGAGAT CTGGGATGAA GCAGTGGGTTCCAGTCCTAC AAGATTTCAG CAATAAGGG 960 ACACTCTGGG GCTTTGTGCC ATTTGTGGATGAACAGCAAC CAACTGAAAT ACCAATTC 1020 ATTACCCTTC ATATTGGAGA CTACAATATGGATGGCTATC CAGACGCTCT GGTCATAC 1080 AAGAACACAT CTGGAAGCAA CCAGCAGGCCTTTTTACTGG AGAACGTCCC TTGTAATA 1140 GCAAGCTGTG AAGAGGCGCG TCGAATGTTTAAAGTCTACT GGGAGCTGAC AGACCTAA 1200 CAAATTAAGG ATGCCATGGT TGCCACCTTCTTTGACATTT ACGAAGATGG AATCTTGG 1260 ATTGTAGTGC TAAGTAAAGG ATATACAAAGAATGATTTTG CCATTCATAC ACTAAAAA 1320 AACTTTGAAG CAGATGCTTA TTTTGTTAAAGTTATTGTTC TTAGTGGTCT GTGTTCTA 1380 GACTGTCCTC GTAAGATAAC GCCCTTTGGAGTGAATCAAC CTGGACCTTA TATCATGT 1440 ACAACTTTAG ATGCAAATGG GTATCTGAAAAATGGATCAG 
CTGGCCAACT CAGCCAAT 1500 GCACATTTAG CTCTCCAACT ACCATACAACGTGCTTGGTT TAGGTCGGAG CGCAAATT 1560 CTTGACCATC TCTACGTTGG TATTCCCCGTCCATCTGGAG AAAAATCTAT ACGAAAAC 1620 GAGTGGACTG CAATCATTCC AAATTCCCAGCTAATTGTCA TTCCATACCC TCACAATG 1680 CCTCGAAGTT GGAGTGCCAA ACTGTATCTTACACCAAGTA ATATTGTTCT GCTTACTG 1740 ATAGCTCTCA TCGGTGTCTG TGTTTTCATCTTGGCAATAA TTGGCATTTT ACATTGGC 1800 GAAAAGAAAG CAGATGATAG AGAAAAACGACAAGAAGCCC ACCGGTTTCA TTTTGATG 1860 ATGTGACTTG CCTTTAATAT TACATAATGGAATGGCTGTT CACTTGATTA GTTGAAAC 1920 AAATTCTGGC TTGAAAAAAT AGGGGAGATTAAATATTATT TATAAATGAT GTATCCCA 1980 GTAATTATTG GAAAGTATTC AAATAAATATGGTTTGAATA TGTCACAA 2028 372 base pairs nucleic acid single linearCORPNOT02 1990522 86 GCGAGCTCGG CTTCCTCAAC ATGGCTGCGC CCTTGTCAGTGGAGGTGGAG TTCGGAGGTG 60 GTGCGGAGCT CCTGTTTGAC GGTATTAAGA AACATCGAGTCACTTTGCCT GGACAGGAG 120 AACCCTGGGA CATCCGGAAC CTGCTCATCT GGATCAAGAAGAATTTGCTA AAAGAGCGG 180 CAGAGTTGTT CATCCAGGGA GACAGCGTGC GGCCAGGAATTCTGGTGCTG ATTAACGAT 240 CCGACTGGGA GCTACTGGGT GAGCTGGACT ACCAGCTTCAGGACCAGGAC AGCGTCCTC 300 TCATCTCCAC TCTGCACGGC GGCTGAGGGC CCTTCTCTGGGGCTGGGCAA CCTTAGAGG 360 GAGAACGAAA AA 372 829 base pairs nucleic acidsingle linear BRAITUT02 2098087 87 CAGGCTCTGT ATCCGTGGCA GCGGCCGTGGCAGGCTGGCT GGGTACCGGC TGTCGCTGAC 60 CCAGGAGAAG CTGCCTGTCT ACATCAGCCTGGGCTGCAGC GCGCTGCCGC CGCGGGGCC 120 GCAGCCATGG CCAAGGACAT CCTGGGTGAAGCAGGGCTAC ACTTTGATGA ACTGAACAA 180 CTGAGGGTGT TGGACCCAGA GGTTACCCAGCAGACCATAG AGCTGAAGGA AGAGTGCAA 240 GACTTTGTGG ACAAAATTGG CCAGTTTCAGAAAATAGTTG GTGGTTTAAT TGAGCTTGT 300 GATCAACTTG CAAAAGAAGC AGAAAATGAAAAGATGAAGG CCATCGGTGC TCGGAACTT 360 CTCAAATCTA TAGCAAAGCA GAGAGAAGCTCAACAGCAGC AACTTCAAGC CCTAATAGC 420 GAAAAGAAAA TGCAGCTAGA AAGGTATCGGGTTGAATATG AAGCTTTGTG TAAAGTAGA 480 GCAGAACAAA ATGAATTTAT TGACCAATTTATTTTTCAGA AATGAACTGA AAATTTCGC 540 TTTATAGTAG GAAGGCAAAA CAAAAAAAAGCCTCTCAAAA CCAAAAAAAC CTCTGTAGC 600 TTCCAGCGGC TTGACCAATG ACCTATGTCACAAGAGGTGC GTGTAAGGAA TGCAGCCCC 660 TGGAACGTGC TATTCACGTC TGTGGGAGCCAGTTTTAACA TCAGTGCACA GCTGCTGCT 720 
GTGGCCCTGC AGTGTACGTT CTCACCTCTTATGCTTAGTT GGAACCCGAA CAAAAATAA 780 CTTTCATCCT TTTGTGTGTA ACTTCTCCAGAATACTAAAT AAAAAGTTC 829 1178 base pairs nucleic acid single linearBRAITUT03 2112230 88 CTTCCCGCCA GTCCCCTAAC CCTGAGGCTG CCGCGCGGCGGTCACTGCGC CGGGGTAGTG 60 GGCCCCAGTG TTGCGCTCTC TGGCCGTTCC TTACACTTTGCTTCAGGCTC CAGTGCAGG 120 GCGTAGTGGG ATATGGCCAA CTCGGGCTGC AAGGACGTCACGGGTCCAGA TGAGGAGAG 180 TTTCTGTACT TTGCCTACGG CAGCAACCTG CTGACAGAGAGGATCCACCT CCGAAACCC 240 TCGGCGGCGT TCTTCTGTGT GGCCCGCCTG CAGGATTTTAAGCTTGACTT TGGCAATTC 300 CAAGGCAAAA CAAGTCAAAC TTGGCATGGA GGGATAGCCACCATTTTTCA GAGTCCTGG 360 GATGAAGTGT GGGGAGTAGT ATGGAAAATG AACAAAAGCAATTTAAATTC TCTGGATGA 420 CAAGAAGGGG TTAAAAGTGG AATGTATGTT GTAATAGAAGTTAAAGTTGC AACTCAAGA 480 GGAAAAGAAA TAACCTGTCG AAGTTATCTG ATGACAAATTACGAAAGTGC TCCCCCATC 540 CCACAGTATA AAAAGATTAT TTGCATGGGT GCAAAAGAAAATGGTTTGCC GCTGGAGTA 600 CAAGAGAAGT TAAAAGCAAT AGAACCAAAT GACTATACAGGAAAGGTCTC AGAAGAAAT 660 GAAGACATCA TCAAAAAGGG GGAAACACAA ACTCTTTAGAACATAACAGA ATATATCTA 720 GGGTATTCTA TGTGCTAATA TAAAATATTT TTAACACTTGAGAACAGGGA TCTGGGGGA 780 CTCCACGTTT GATCCATTTT CAGCAGTGCT CTGAAGGAGTATCTTACTTG GGTGATTCC 840 TGTTTTTAGA CTATAAAAAG AAACTGGGAT AGGAGTTAGACAATTTAAAA GGGGTGTAT 900 AGGGCCTGAA ATATGTGACA AATGAATGTG AGTACCCCTTCTGTGAACAC TGAAAGCTA 960 TCTCTTGAAT TGATCTTAAG TGTCTCCTTG CTCTGGTAAAAGATAGATTT GTAGCTCA 1020 TGATGATGGT GCTGGTGAAT TGCTCTGCTC TGTCTGAGATTTTTAAAAAT CAGCTTAA 1080 AGAGTAATCT GCAGACAATT GATAATAACA TTTTGAAAATTGGAAAGATG GTATACTG 1140 TTTAGAGGAA TAAACGTATT TGTGGTTTAA AAAAAAAA 1178748 base pairs nucleic acid single linear BRSTTUT02 2117050 89CCCACGCGTC CGGCGACGGC GCGGACCTGG AGCTTCCGCG CGGTGGCTTC ACTCTCCTGT 60AAAACGCTAG AGCGGCGAGT TGTTACCTGC GTCCTCTGAC CTGAGAGCGA AGGGGAAAG 120GGCGAGATGA CTGACCGCTA CACCATCCAT AGCCAGCTGG AGCACCTGCA GTCCAAGTA 180ATCGGCACGG GCCACGCCGA CACCACCAAG TGGGAGTGGC TGGTGAACCA ACACCGCGA 240TCGTACTGCT CCTACATGGG CCACTTCGAC CTTCTCAACT ACTTCGCCAT TGCGGAGAA 300GAGAGCAAAG CGCGAGTCCG CTTCAACTTG ATGGAAAAGA TGCTTCAGCC TTGTGGACC 
360CCAGCCGACA AGCCCGAGGA GAACTGAGAC TCTGCCTTAC CACCGCAGTG CGGGGGCAC 420TCTCCCAGCG TTTCTCCGGT TTGCCAATCC TCTTAAGTAT TCCTGTCTCC AAAGGACCG 480CTCTCCATGG CTCCTGCGCC TCGTGCTTTC CGCGTACAGA AGTGCTTGCC CGGGGAGTC 540CGCCTGACCT GCCTTCATGT GGACCCTTAG AACAGCACTG GGAGACCAGC AGGACTCCT 600AGAACTGTGC TGGTGGAGAG GTCCTAGAGC CGGCGAGCGT TTGAGAAGAG GGCATGGCG 660TGGAGTGAGA TGGGATTTGG CGTCTCGTTT TTGGCTAATT GATTGTCATT GGCTTTTTC 720ATAAAGTTTA GAAATCGTAA AAAAAAAA 748 1078 base pairs nucleic acid singlelinear SININOT01 2184712 90 GCAGGACGGA TTGGGCAAGG CTGGTCCCTG TGTGATGAGACATCACCCTC CCAGGAGCAA 60 GGCGGAAGTC TGGAGGACGC TGAGGGGCGG AGGCGGGAGAGGCGAGCTCG CGATGAGTG 120 TCTCGGCAGG CTCTTCGGGA AGGGGAAGAA GGAGAAAGGGCCAACCCCTG AAGAAGCAA 180 ACAGAAACTG AAGGAGACAG AGAAGATACT GATCAAGAAACAGGAATTTT TGGAGCAGA 240 GATTCAACAG GAGCTACAAA CAGCCAAGAA GTATGGGACCAAGAATAAGA GAGCTGCCC 300 ACAGGCTTTG CGGAGGAAGA AAAGATTCGA ACAGCAGCTGGCACAAACTG ACGGGACAT 360 ATCCACCCTG GAGTTTCAGC GTGAGGCCAT TGAGAATGCCACTACCAATG CAGAAGTCC 420 TCGTACCATG GAGCTTGCTG CCCAAAGCAT GAAGAAGGCCTACCAGGACA TGGACATTG 480 CAAGGTAGAT GAACTGATGA CTGACATCAC GGAACAACAGGAGGTGGCCC AGCAGATCT 540 AGATGCCATT TCTCGGCCTA TGGGCTTTGG AGATGATGTGGATGAGGATG AACTGCTGG 600 GGAGCTAGAG GAGCTGGAGC AGGAGGAATT GGCCCAGGAGTTGTTAAATG TGGGCGACA 660 GGAAGAAGAA CCCTCAGTCA AATTGCCTAG TGTACCTTCTACTCATCTGC CGGCAGGGC 720 AGCTCCCAAA GTGGATGAAG ATGAAGAAGC ACTAAAGCAGTTGGCTGAGT GGGTATCCT 780 ATAAATCTGG GCTTGTCTTC CTAATGCTAC CTTTGTTGGTCCTTTCTTCC TTAAGTGCC 840 AGTGCTGAGC TAAAGGAGGA TAACTTTTTG GGGAAGTCATGCTGAGGGTG GTAGTGTGA 900 CCTGCCTGAA AAAAGGGTCT CTTACCCTCC CAGCCCTGGCTCAACTCTGA AGAAGGATC 960 TGCTACAGAA GGAGCCCTTG GGCTCCCTTC TCTTTGATAGCAGTTATAAT GCCCTTGT 1020 CCAATAAAAC TGGGCAGATG GAATCCTAGT GTCTATACTGCCTTGTCTAC CCCTGAAG 1078 1446 base pairs nucleic acid single linearBRAINON01 2290475 91 TGGGAGGCGG AGGCACAACT AAGAGCGACC TAGCATCGCAAAGCCGCCCT CGGGGCGCTC 60 ATGGCGGGAC GCTCCTGGGA AAGGCTTTAG CGCGGTGTCTCTCTCTCTGG CCTTGGCCT 120 TGTGACTATC AGGTCCTCGC GCTGCCGCGG CATCCAGGCGTTCAGAAACT CGTTTTCAT 180 
TTCTTGGTTT CATCTTAATA CCAACGTCAT GTCTGGTTCTAATGGTTCCA AAGAAAATT 240 TCACAATAAG GCTCGGACGT CTCCTTACCC AGGTTCAAAAGTTGAACGAA GCCAGGTTC 300 TAATGAGAAA GTGGGCTGGC TTGTTGAGTG GCAAGACTATAAGCCTGTGG AATACACTG 360 AGTCTCTGTC TTGGCTGGAC CCAGGTGGGC AGATCCTCAGATCAGTGAAA GTAATTTTT 420 TCCCAAGTTT AACGAAAAGG ATGGGCATGT TGAGAGAAAGAGCAAGAATG GCCTGTATG 480 GATTGAAAAT GGAAGACCGA GAAATCCTGC AGGACGGACTGGACTGGTGG GCCGGGGGC 540 TTTGGGGCGA TGGGGCCCAA ATCACGCTGC AGATCCCATTATAACCAGAT GGAAAAGGG 600 TAGCAGTGGA AATAAAATCA TGCATCCTGT TTCTGGGAAGCATATCTTAC AATTTGTTG 660 AATAAAAAGG AAAGACTGTG GAGAATGGGC AATCCCAGGGGGGATGGTGG ATCCAGGAG 720 GAAGATTAGT GCCACACTGA AAAGAGAATT TGGTGAGGAAGCTCTCAACT CCTTACAGA 780 AACCAGTGCT GAGAAGAGAG AAATAGAGGA AAAGTTGCACAAACTCTTCA GCCAAGACC 840 CCTAGTGATA TATAAGGGAT ATGTTGATGA TCCTCGAAACACTGATAATG CATGGATGG 900 GACAGAAGCT GTGAACTACC ATGACGAAAC AGGTGAGATAATGGATAATC TTATGCTAG 960 AGCTGGAGAT GATGCTGGAA AAGTGAAATG GGTGGACATCAATGATAAAC TGAAGCTT 1020 TGCCAGTCAC TCTCAATTCA TCAAACTTGT GGCTGAGAAACGAGATGCAC ACTGGAGC 1080 GGACTCTGAA GCTGACTGCC ATGCGTTGTA GCTGATGGTCTCCGTGTAAG CCAAAGGC 1140 ACAGAGGAGC ATATACTGAA AAGAAGGCAG TATCACAGAATTTATACTAT AAAAAGGG 1200 GGGTAGGCCA CTTGGCCTAT TTACTTTCAA AACAATTTGCATTTAGAGTG TTTCGCAT 1260 GAATAACATG AGTAAGATGA ACTGGAACAC AAAATTTTCAGCTCTTTGGT CAAAAGGA 1320 ATAAGTAATC ATATTTTGTA TGTATTCGAT TTAAGCATGGCTTAAATTAA ATTTAAAC 1380 CTAATGCTCT TTGAAGAATC ATAATCAGAA TAAAGATAAATTCTTGATCA GCTATAAA 1440 AAAAAA 1446 659 base pairs nucleic acid singlelinear LUNGNOT20 2353452 92 CCAGCTACCG AAGCACTGGA GAGTGTCATG GAGGCCTACGAGCAGGTCCA AAAGGGACCC 60 CTGAAGCTGA AAGGCGTCGC AGAGCTGGGA GTGACCAAGCGGAAGAAGAA AAAGAAGGA 120 AAAGACAAAG CGAAACTCCT GGAAGCAATG GGAACGAGCAAAAAGAACGA GGAGGAGAA 180 CGGCGCGGCC TGGACAAGCG GACCCCGGCC CAGGCGGCCTTCGAGAAAAT GCAGGAGAA 240 CGGCAAATGG AAAGGATCCT AAAGAAGGCA TCCAAAACCCACAAGCAGAG AGTGGAGGA 300 TTCAACAGAC ACCTGGACAC ACTCACGGAG CATTACGACATTCCCAAAGT CAGCTGGAC 360 AAGTAGCCGC CTGCCCCCAG TATGGAGCAG CATCGAGGGTTCGCAAAAGG CCACACTGG 420 GTTGTGTGTG TTTCCTTTGG 
TATATTCTGG AAACATGGCTACACACACCC TTGCATCTT 480 TGCTACAGAC TGCTTTTCGA AGCTGTGTAC CCTCATTCTGGAACTTGATT AAAGTAAGA 540 CGTCCTTGTA CTCAGTTTAG GCTTCTTGGC AACATACAGAAGATACACCC TTTTCGTTT 600 GATGGAAAGT TTCTAAGTTT ATCCAGAGGT AAAGCCCATTGTGTGTCTGT GTCATGTAA 659 1572 base pairs nucleic acid single linearTHP1NOT03 2469611 93 GGAAGGGGAA GTTTCGCCTC AGAAGGCTGC CTCGCTGGTCCGAATTCGGT GGCGCCACGT 60 CCGCCCGTCT CCGCCTTCTG CATCGCGGCT TCGGCGGCTTCCACCTAGAC ACCTAACAG 120 CGCGGAGCCG GCCGCGTCGT GAGGGGGTCG GCACGGGGAGTCGGGCGGTC TTGTGCATC 180 TGGCTACCTG TGGGTCGAAG ATGTCGGACA TCGGAGACTGGTTCAGGAGC ATCCCGGCG 240 TCACGCGCTA TTGGTTCGCC GCCACCGTCG CCGTGCCCTTGGTCGGCAAA CTCGGCCTC 300 TCAGCCCGGC CTACCTCTTC CTCTGGCCCG AAGCCTTCCTTTATCGCTTT CAGATTTGG 360 GGCCAATCAC TGCCACCTTT TATTTCCCTG TGGGTCCAGGAACTGGATTT CTTTATTTG 420 TCAATTTATA TTTCTTATAT CAGTATTCTA CGCGACTTGAAACAGGAGCT TTTGATGGG 480 GGCCAGCAGA CTATTTATTC ATGCTCCTCT TTAACTGGATTTGCATCGTG ATTACTGGC 540 TAGCAATGGA TATGCAGTTG CTGATGATTC CTCTGATCATGTCAGTACTT TATGTCTGG 600 CCCAGCTGAA CAGAGACATG ATTGTATCAT TTTGGTTTGGAACACGATTT AAGGCCTGC 660 ATTTACCCTG GGTTATCCTT GGATTCAACT ATATCATCGGAGGCTCGGTA ATCAATGAG 720 TTATTGGAAA TCTGGTTGGA CATCTTTATT TTTTCCTAATGTTCAGATAC CCAATGGAC 780 TGGGAGGAAG AAATTTTCTA TCCACACCTC AGTTTTTGTACCGCTGGCTG CCCAGTAGG 840 GAGGAGGAGT ATCAGGATTT GGTGTGCCCC CTGCTAGCATGAGGCGAGCT GCTGATCAG 900 ATGGCGGAGG CGGGAGACAC AACTGGGGCC AGGGCTTTCGACTTGGAGAC CAGTGAAGG 960 GCGGCCTCGG GCAGCCGCTC CTCTCAAGCC ACATTTCCTCCCAGTGCTGG GTGCGCTT 1020 CAACTGCGTT CTGGCTAACA CTGTTGGACC TGACCCACACTGAATGTAGT CTTTCAGT 1080 GAGACAAAGT TTCTTAAATC CCGAAGAAAA ATATAAGTGTTCCACAAGTT TCACGATT 1140 CATTCAAGTC CTTACTGCTG TGAAGAACAA ATACCAACTGTGCAAATTGC AAAACTGA 1200 ACATTTTTTG GTGTCTTCTC TTCTCCCCTT TCCGTCTGAATAATGGGTTT TAGCGGGT 1260 TAGTCTGCTG GCATTGAGCT GGGGCTGGGT CACCAAACCCTTCCCAAAAG GACCCTTA 1320 TCTTCTCTTG CACACATGCC TCTCTCCCAC TTTTCCCAACCCCCACATTT GCAACTAG 1380 GAGGTTTGCC ATAAAATTGC TCTGCCCTTG ACAAGTTCTGTTAATTTATT GACTTTTG 1440 AAGGCCTGGT CACAACAATC ATATTTCACG TATTTTCCCCCTTTGGTGGC 
ANGACTGT 1500 GCAATAGGGG GAGAAGACAA GCAGCGGATG GAAGCGTTTTTCTCAAGTTT TGGGAATT 1560 TTCGANCTGA CA 1572 3520 base pairs nucleic acidsingle linear LIVRTUT04 2515476 94 GAGAAGCCAA GGAAGGAAAC AGGGAAAAATGTCGCCATGA AGGCCGAGAA CCGCTGCCGC 60 CGCCGACCCC CGCCGGCCCT GAACGCCATGAGCCTGGGTC CCCGCCGCGC CCGCTCCGC 120 TCGACTGCCG TCGCCGCCGA GGCCCCCGTTGATGCCGCTG AGCTCCCCCA ACGCCGCCG 180 CACCGCCTCC GACATGGACA AGAACAGCGGCTCCAACAGC TCCTCCGCCT CTTCGGGCA 240 CAGCAAAGGG CAACAGCCGC CCCGCTCCGCCTCGGCGGGG CCAGCCGGCG AGTCTAAAC 300 CAAGAGCGAT GGAAAGAACT CCAGTGGATCCAAGCGTTAT AATCGCAAAC GTGAACTTT 360 CTACCCCAAA AATGAAAGTT TTAACAACCAGTCCCGTCGC TCCAGTTCAC AGAAAAGCA 420 GACTTTTAAC AAGATGCCTC CTCAAAGGGGCGGCGGCAGC AGCAAACTCT TTAGCTCTT 480 TTTTAATGGT GGAAGACGAG ATGAGGTAGCAGAGGCTCAA CGGGCAGAGT TTAGCCCTG 540 CCAGTTCTCT GGTCCTAAGA AGATCAACCTGAACCACTTG TTGAATTTCA CTTTTGAAC 600 CCGTGGCCAG ACGGGTCACT TTGAAGGCAGTGGACATGGT AGCTGGGGAA AGAGGAACA 660 GTGGGGACAT AAGCCTTTTA ACAAGGAACTCTTTTTACAG GCCAACTGCC AATTTGTGG 720 GTCTGAAGAC CAAGACTACA CAGCTCATTTTGCTGATCCT GATACATTAG TTAACTGGG 780 CTTTGTGGAA CAAGTGCGCA TTTGTAGCCATGAAGTGCCA TCTTGCCCAA TATGCCTCT 840 TCCACCTACT GCAGCCAAGA TAACCCGTTGTGGACACATC TTCTGCTGGG CATGCATCC 900 GCACTATCTT TCACTGAGTG AGAAGACGTGGAGTAAATGT CCCATCTGTT ACAGTTCTG 960 GCATAAGAAG GATCTCAAGA GTGTTGTTGCCACAGAGTCA CATCAGTATG TTGTTGGT 1020 TACCATTACG ATGCAGCTGA TGAAGAGGGAGAAAGGGGTG TTGGTGGCTT TGCCCAAA 1080 CAAATGGATG AATGTAGACC ATCCCATTCATCTAGGAGAT GAACAGCACA GCCAGTAC 1140 CAAGTTGCTG CTGGCCTCTA AGGAGCAGGTGCTGCACCGG GTAGTTCTGG AGGAGAAA 1200 AGCACTAGAG CAGCAGCTGG CAGAGGAGAAGCACACTCCC GAGTCCTGCT TTATTGAG 1260 AGCTATCCAG GAGCTCAAGA CTCGGGAAGAGGCTCTGTCG GGATTGGCCG GAAGCAGA 1320 GGAGGTCACT GGTGTTGTGG CTGCTCTGGAACAACTGGTG CTGATGGCTC CCTTGGCG 1380 GGAGTCTGTT TTTCAACCCA GGAAGGGTGTGCTGGAGTAT CTGTCTGCCT TCGATGAA 1440 AACCACGGAA GTTTGTTCTC TGGACACTCCTTCTAGACCT CTTGCTCTCC CTCTGGTA 1500 AGAGGAGGAA GCAGTGTCTG AACCAGAGCCTGAGGGGTTG CCAGAGGCCT GTGATGAC 1560 GGAGTTAGCA GATGACAATC TTAAAGAGGGGACCATTTGC ACTGAGTCCA GCCAGCAG 1620 
ACCCATCACC AAGTCAGGCT TCACACGCCTCAGCAGCTCT CCTTGTTACT ACTTTTAC 1680 AGCGGAAGAT GGACAGCATA TGTTCCTGCACCCTGTGAAT GTGCGCTGCC TCGTGCGG 1740 GTACGGCAGC CTGGAGAGGA GCCCCGAGAAGATCTCAGCA ACTGTGGTGG AGATTGCT 1800 CTACTCCATG TCTGAGGATG TTCGACAGCGTCACAGATAT CTCTCTCACT TGCCACTC 1860 CTGTGAGTTC AGCATCTGTG AACTGGCTTTGCAACCTCCT GTGGTCTCTA AGGAAACC 1920 AGAGATGTTC TCAGATGACA TTGAGAAGAGGAAACGTCAG CGCCAAAAGA AGGCTCGG 1980 GGAACGCCGC CGAGAGCGCA GGATTGAGATAGAGGAGAAC AAGAAACAGG GCAAGTAC 2040 AGAAGTCCAC ATTCCCCTCG AGAATCTACAGCAGTTTCCT GCCTTCAATT CTTATACC 2100 CTCCTCTGAT TCTGCTTTGG GTCCCACCAGCACCGAGGGC CATGGGGCCC TCTCCATT 2160 TCCTCTCAGC AGAAGTCCAG GTTCCCATGCAGACTTTCTG CTGACCCCTC TGTCACCC 2220 TGCCAGTCAG GGCAGTCCCT CATTCTGCGTTGGGAGTCTG GAAGAAGACT CTCCCTTC 2280 TTCCTTTGCC CAGATGCTGA GGGTTGGAAAAGCAAAAGCA GATGTGTGGC CCAAAACT 2340 TCCAAAGAAA GATGAGAACA GCTTAGTTCCTCCTGCCCCT GTGGACAGCG ACGGGGAG 2400 TGATAATTCA GACCGTGTTC CTGTGCCCAGTTTTCAAAAT TCCTTCAGCC AAGCTATT 2460 AGCAGCCTTC ATGAAACTGG ACACACCAGCTACTTCAGAT CCCCTCTCTG AAGAGAAA 2520 AGGAAAGAAA AGAAAAAAAC AGAAACAGAAGCTCCTGTTC AGCACCTCAG TCGTCCAC 2580 CAAGTGACAC TACTGGCCCA GGCTACCTTCTCCATCTGGT TTTTGTTTTT GTTTTTTT 2640 CCCCCATGCT TTTGTTTGGC TGCTGTAATTTTTAAGTATT TGAGTTTGAA CAGATTAG 2700 CTGGGGGGAG GGGGTTTCCA CAATGTGAGGGGGAACCAAG AAAATTTTAA ATACAGTG 2760 TTTTCCAGCT TCCTGTCTTT ACACCAAAATAAAGTATTGA CACAAGAGAT CTCTTCCT 2820 CAAGGTTTTT AGTTCATTGC CAGTTTAGTCTTTTTGACCC ATGTGTAATT AATTTTTC 2880 AACCCAAAGT AAGATTGAGT CCCCTTTGAGATGCATTAGA GCAGTCCAAC CCAGAATG 2940 ACACACTGCT CTGCTGTACC ATCATGTCAGGGCTTCCTGG ACTCAGTACA CCTCTCAG 3000 TGTCTTTTAA AAAACAGCTG AATCTTTACTACCTATTTAG TTCTCCTTGT TAAAGAAA 3060 GGGGTGGGAA TAAAATGGAT TTAGGACACCCAGTTTGAAT TGCAGTTTTT TTTTTTCT 3120 CACATGGCCA GGCTGTGGTG CCAGCTTAATGGAGTAGGCT GTCCTTGGCA CTTGCATG 3180 TGAAAGGAGG GTTTTGCCTC TTCTTGAGCATGGCTTGAGT TGGTAAGGAA AGCTGTAA 3240 CACGAAGCCC TGAGACCTGC TACCCCTAAGATCGAGCTTG TTTTCAGTGA CTGGCTTG 3300 TCATAGGAGG AGGAGTCTGG TACAGCTGCAGGAGAGCAGG GCCATCTGAA GCGGTAGC 3360 TGCCACCATC TCCCTCTCAT 
CTAGAGCAGTTTTCTTATGC CTTGGTTTGA GCTGAATT 3420 ATGTGAATTC TTTTGCTGCT TAATAAAGTGACCTCTAGGT GCATTAGAAT GCGAAGGC 3480 ATAGTTGCAA TAAATCACCT GCACAAGCAAAAAAAAAAAA 3520 1904 base pairs nucleic acid single linear THP1AZS082754573 95 CGGGGGAGTC GGAGGAGGTG GCGGCGCTGG ANNTCCTCCC GGGGACCAGCGACCCGGGAG 60 CATGCACGTC GTCGCACCAG CTTCACTGAG GCTTGGAACA GGAACTAATCTCCCTCCAT 120 TCCAACTTGC TTGACAAAGC TCGCTCTTCC TCCAGCCGCT GAGCCGTCCCTTCTCGCCA 180 GTCCCAGAGC AGGCACCGCG CCGAGGCCCC GCCGCTGGAG CGCGAGGACAGTGGGACCT 240 CAGTTTGGGG AAGATGATAA CAGCTAAGCC AGGGAAAACA CCGATTCAGGTATTACACG 300 ATACGGCATG AAGACCAAGA ACATCCCAGT TTATGAATGT GAAAGATCTGATGTGCAAA 360 ACACGTGCCC ACTTTCACCT TCAGAGTAAC CGTTGGTGAC ATAACCTGCACAGGTGAAG 420 TACAAGTAAG AAGCTGGCGA AACATAGAGC TGCAGAGGCT GCCATAAACATTTTGAAAG 480 CAATGCAAGT ATTTGCTTTG CAGTTCCTGA CCCCTTAATG CCTGACCCTTCCAAGCAAC 540 AAAGAACCAG CTTAATCCTA TTGGTTCATT ACAGGAATTG GCTATTCATCATGGCTGGA 600 ACTTCCTGAA TATACCCTTT CCCAGGAGGG AGGACCTGCT CATAAGAGAGAATATACTA 660 AATTTGCAGG CTAGAGTCAT TTATGGAAAC TGGAAAGGGG GCATCAAAAAAGCAAGCCA 720 AAGGAATGCT GCTGAGAAAT TTCTTGCCAA ATTTAGTAAT ATTTCTCCAGAGAACCACA 780 TTCTTTAACA AATGTAGTAG GACATTCTTT AGGATGTACT TGGCATTCCTTGAGGAATT 840 TCCTGGTGAA AAGATCAACT TACTGAAAAG AAGCCTCCTT AGTATTCCAAATACAGATT 900 CATCCAGCTG CTTAGTGAAA TTGCCAAGGA ACAAGGTTTT AATATAACATATTTGGATA 960 AGATGAACTG AGCGCCAATG GACAATATCA ATGTCTTGCT GAACTGTCCACCAGCCCC 1020 CACAGTCTGT CATGGCTCCG GTATCTCCTG TGGCAATGCA CAAAGTGATGCAGCTCAC 1080 TGCTTTGCAG TATTTAAAGA TAATAGCAGA AAGAAAGTAA ATCTGGAGCAACTTAAAA 1140 TCTTTCAGTA GCACATAAAA AGTTCCCCTC TGGCCCCTTC CCAAGTAAAACTTTTACC 1200 AGTGTTTATG TCTTGTTTCT AAATCTCTTC ATAGATTCCA TCAACACTCCAGATTTAA 1260 ATCTCCTCAT AGTTGTTATT AAGCTCTTTT TAATGGCTTC AACTTTGTATCAGTATAC 1320 TATTTATAAA CTTTGTACCA CAAGAGAGAG TGTAGCACCC ATTTTACAGTGCCATGCA 1380 TCAGAGAAAG AAACTGCATG TTTGTTGTTG ATGATGAAAT AAAAATGCTAGCGACAGT 1440 TTCTTACTGG TGCTTAAGCT CTTCTTTGCA CAAAGCTTTA TAAAGGGAATTCAAAGGA 1500 CCCTTTAGAA TTAGAGTCTT GAGGGACAGC ACTAACAGGC CTTTATTAAGTATGATTG 1560 
TGTTAAATTT CAGGGAACAT GATTGGTCTG CTGTGTATTT GAATTCATGTAACAAAGA 1620 TGTTACGATG GGATTCTGCT CATTTTATTA AAAAGCTACT GACTTGACTGTCATCCTG 1680 CTTGTTAGCC ATTGTGAATA AGATTTTAAT GTTGATAATT CTGTTATTTACATATCTC 1740 ATTTACTTTG AAATTCAAAG GTGAAAATAA AAAATGATGG CCTAAGTAAAATTTACAA 1800 ATACATGTAA TGGGTTACTT CCTTACTTTC TCTAAGCGTA TGCAACCTTGTATTTTTC 1860 TGACATAACC ATACACAGGT GAATTACTAG TTTAAAAAGC ATTT 1904 1621base pairs nucleic acid single linear TLYMNOT04 2926777 96 ATTTGGGTTTTTAAAGGGAA GGCGTCCGCG CGGCGGCCAT TTTGTCTTGT CGGCTCCTGT 60 GTGTAGGAGGGATTTCGGCC TGAGAGCGGG CCGAGGAGAT TGGCGACGGT GTCGCCCGT 120 CTTTTCGTTGGCGGGTGCCT GGGCTGGTGG GAACAGCCGC CCGAAGGAAG CACCATGAT 180 TCGGCCGCGCAGTTGTTGGA TGAGTTAATG GGCCGGGACC GAAACCTAGC CCCGGACGA 240 AAGCGCACAAACGTGCGGTG GGACCACGAG AGCGTTTGTA AATATTATCT CTGTGGTTT 300 TGTCCTGCGGAATTGTTCAC AAATACACGT TCTGATCTTG GTCCGTGTGA AAAAATTCA 360 GATGAAAATCTACGAAAACA GTATGAGAAG AGCTCTCGTT TCATGAAAGT TGGCTATGA 420 AGAGATTTTTTGCGATACTT ACAGAGCTTA CTTGCAGAAG TAGAACGTAG GATCAGACG 480 GGCCATGCTCGTTTGGCATT ATCTCAAAAC CAGCAGTCTT CTGGGGCCGC TGGCCCAAC 540 GGCAAAAATGAAGAAAAAAT TCAGGTTCTA ACAGACAAAA TTGATGTACT TCTGCAACA 600 ATTGAAGAATTAGGGTCTGA AGGAAAAGTA GAAGAAGCCC AGGGGATGAT GAAATTAGT 660 GAGCAATTAAAAGAAGAGAG AGAACTGCTA AGGTCCACAA CGTCGACAAT TGAAAGCTT 720 GCTGCACAAGAAAAACAAAT GGAAGTTTGT GAAGTATGTG GAGCCTTTTT AATAGTAGG 780 GATGCCCAGTCCCGGGTAGA TGACCATTTG ATGGGAAAAC AACACATGGG CTATGCCAA 840 ATTAAAGCTACTGTAGAAGA ATTAAAAGAA AAGTTAAGGA AAAGAACCGA AGAACCTGA 900 CGTGATGAGCGTCTAAAAAA GGAGAAGCAA GAAAGAGAAG AAAGAGAAAA AGAACGGGA 960 AGAGAAAGGGAAGAAAGAGA AAGGAAAAGA CGAAGGGAAG AGGAAGAAAG AGAAAAAG 1020 AGGGCTCGTGACAGAGAAAG AAGAAAGAGA AGTCGTTCAC GAAGTAGACA CTCAAGCC 1080 ACATCAGACAGAAGATGCAG CAGGTCTCGG GACCACAAAA GGTCACGAAG TAGAGAAA 1140 AGGCGGACCAGAAGTAGAGA TCGACGAAGA AGCAGAAGCC ATGATCGATC AGAAAGAA 1200 CACAGATCTCGAAGTCGGGA TCGAAGAAGA TCAAAAAGCC GGGATCGAAA GTCATATA 1260 CACAGGAGCAAAAGTCGGGA CAGAGAACAA GATAGAAAAT CCAAGGAGAA AGAAAAGA 1320 GGATCTGATGATAAAAAAAG TAGTGTGAAG TCCGGTAGTC 
GAGAAAAGCA GAGTGAAG 1380 ACAAACACTGAATCGAAGGA AAGTGATACT AAGAATGAGG TCAATGGGAC CAGTGAAG 1440 ATTAAATCTGAAGGTGACAC TCAGTCCAAT TAAAACTGAT CTGATAAGAC CTCAGATC 1500 ACAGAGGACTACTGTTCGAA GATTTTTGGA AGAATACTGA GAACGGCATA AAGTGAAG 1560 CGACATTTAAAAAATGAGGT GAAAGAAAGC TATAGTGGCA TAGAAAAAGT ATAAAGCT 1620 G 1621 1112base pairs nucleic acid single linear TESTNOT07 3217567 97 CAACGATCGTGGGNCAGGTA GGTGGTTTCT GGTTTGTTGG GGCGTGTGTA TGTGTATTTA 60 GGGGGACTGAAGGGTACGTG GGGCGAAACA AAACCGGCCA TGGCAGCAGC GGAGGAGGA 120 GACGGGGGCCCCGAAGGGCC AAATCGCGAG CGGGGCGGGG CGGGCGCGAC CTTCGAATG 180 AATATATGTTTGGAGACTGC TCGGGAAGCT GTGGTCAGTG TGTGTGGCCA CCTGTACTG 240 TGGCCATGTCTTCATCAGTG GCTGGAGACA CGGCCAGAAC GGCAAGAGTG TCCAGTATG 300 AAAGCTGGGATCAGCAGAGA GAAGGTTGTC CCGCTTTATG GGCGAGGGAG CCAGAAGCC 360 CAGGATCCCAGATTAAAAAC TCCACCCCGC CCCCAGGGCC AGAGACCAGC TCCGGAGAG 420 AGAGGGGGATTCCAGCCATT TGGTGATACC GGGGGCTTCC ACTTCTCATT TGGTGTTGG 480 GCTTTTCCCTTTGGCTTTTT CACCACCGTC TTCAATGCCC ATGAGCCTTT CCGCCGGGG 540 ACAGGTGTGGATCTGGGACA GGGTCACCCA GCCTCCAGCT GGCAGGATTC CCTCTTCCT 600 TTTCTCGCCATCTTCTTCTT TTTTTGGCTG CTCAGTATTT GAGCTATGTC TGCTTCCTG 660 CCACCTCCAGCCAGAGAAGA ATCAGTATTG AGGGTCCCTG CTGACCCTTC CGTACTCCT 720 GACCCCCTTGACCCCTCTAT TTCTGTTGGC TAAGGCCAGC CCTGGACATT GTCCAGGAA 780 GCCTGGGGAGGAGGAGTGAA GTCTGTGCAT AGATGGGAGA GCCTTCTGCT CAGAGGCTC 840 CTCAGTAACGTTGTTTAATT CTCTGCCCTG GGGAAGGAGG ATGGATTGAG AGAATGTCT 900 TCTCCTCTCCTAAGTCTTTG CTTTCCCTGA TTTCTTGATT TGATCTTCAA AGGTGGGCA 960 AGTTCCCTCTGACTCTTCCC CCACTCCCCA TCTTACTGAT TTAATTTAAT TTTTCACT 1020 CCAGAGTCTAATATGGATTC TGACTCTTAA GTGCTTCCGC CCCCTCACTA CCTCCTTT 1080 TACAAATTCAATAAAAAAGG TGAAATATAA AA 1112 1040 base pairs nucleic acid single linearSPLNNOT10 3339274 98 CGAGCCCNCC CCCAGCGGGA GCTGTGGGGC AGAGGCGCTGCTGTGGTTGG TCAGTCCAGT 60 AAGAAGCCAG CAGGGCTGGG TGCTGGGGCT TCTTCTCCTGAAGGGGCTGC AAGAGGGAA 120 GCTTAGCCAT GTCGTCCTTG ATCAGAAGGG TGATCAGCACCGCGAAAGCC CCAGGGGCC 180 TTGGACCCTA CAGTCAAGCT GTATTAGTCG ACAGGACCATTTACATTTCA GGACAGATA 240 GCATGGACCC TTCAAGTGGA CAGCTTGTGT 
CAGGAGGGGTAGCAGAAGAA GCTAAACAA 300 CTCTTAAAAA CATGGGTGAA ATTCTGAAAG CTGCAGGCTGTGACTTCACT AACGTGGTG 360 AAACAACTGT TCTTCTGGCT GACATAAATG ACTTCAATACTGTCAATGAA ATCTACAAA 420 AGTATTTCAA GAGTAATTTT CCTGCTAGAG CTGCTTACCAAGTTGCTGCT TTACCCAAA 480 GCAGCCGAAT TGAAATTGAA GCAGTAGCTA TCCAAGGACCACTGACAACG GCATCACTA 540 AAGTGGGCCC AGTGCTGTGT AGTCTGGAAT TGTTAACATTTTAATTTTTA CAATTGATG 600 AACATCTTAA TTAACCTTTT AATTTTCACA ATTGATGACAGTGTGAGTTT GATGAAAAT 660 TCTGAAGCTA TTATGGAAAT ACCATGTAAT AGGGAGAGTTGAACATGAAT ATTAGAGAA 720 GAATCCAGTT ACTTTTTTAA ATTACACCTG TGTGCACCTGTATTACTGAA TATAGGAAA 780 AGATACCCAT TACATAGTTA CTCAGTAAAC AAAAGAGAAATACCAGGTAG GAAAGAAGA 840 TTACTATTCC TGAGAAATAA TCAAGAACAT ATTTAATTTAAACTAATGAT GTGAACTAT 900 TAGTTTTGAT GTCCGTTATG TGATTCTGCT TTTACTTGAGTAAAATTAAA GTGTTTAAA 960 TTGAGATCAA GGAGAAGATA GTGGAACAAA ATGTTATATAGATAATATTT TTCTAATG 1020 AATAAAATAG GCAGATTTCC 1040

What is claimed is:
 1. A purified protein comprising an amino acid sequence selected from SEQ ID NOs: 1-49.
 2. An isolated polynucleotide comprising a nucleic acid sequence encoding the protein of claim 1 or the complement of the polynucleotide.
 3. A composition comprising a polynucleotide of claim 2 and a reporter molecule.
 4. An isolated polynucleotide comprising a nucleic acid sequence selected from SEQ ID NOs: 50-98 and the complement of the polynucleotide.
 5. A vector containing the polynucleotide of claim 2.
 6. A host cell containing the vector of claim 5.
 7. A method for using a polynucleotide to produce a protein comprising: a) culturing the host cell of claim 6 under conditions for the expression of the protein, and b) recovering the protein from the host cell culture.
 8. A method for using a polynucleotide to detect expression of a nucleic acid in a sample, the method comprising: a) hybridizing the polynucleotide of claim 2 to nucleic acids of the sample, thereby forming a hybridization complex, and b) detecting hybridization complex formation, wherein complex formation indicates the expression of the polynucleotide in the sample.
 9. The method of claim 8 wherein the polynucleotide is attached to a substrate or bonded to the surface of a microarray.
 10. The method of claim 8 wherein the nucleic acids of the sample are amplified prior to hybridization.
 11. A method of using a polynucleotide to screen a plurality of molecules to identify a ligand, the method comprising: a) combining the polynucleotide of claim 2 with a plurality of molecules under conditions to allow specific binding, and b) detecting specific binding, thereby identifying a ligand which specifically binds the polynucleotide.
 12. The method of claim 11 wherein the molecules are selected from DNA molecules, RNA molecules, peptide nucleic acids, artificial chromosome constructions, peptides, and transcription factors.
 13. A method for diagnosing a disease associated with gene expression in a sample containing nucleic acids, the method comprising: a) hybridizing a polynucleotide of claim 2 to nucleic acids of the sample under conditions to form a hybridization complex, b) comparing hybridization complex formation with standards, thereby diagnosing the disease.
 14. The method of claim 13 wherein expression is diagnostic of cancer or immune response.
 15. A composition comprising the protein of claim 1 and a pharmaceutical carrier or a labeling moiety.
 16. A method for using a protein to screen a plurality of molecules to identify a ligand, the method comprising: a) combining the protein of claim 1 with the molecules under conditions to allow specific binding, and b) detecting specific binding, thereby identifying a ligand which specifically binds the protein.
 17. The method of claim 16 wherein the molecules are selected from DNA molecules, RNA molecules, peptide nucleic acids, peptides, pharmaceutical agents, proteins, mimetics, agonists, antagonists, antibodies, immunoglobulins, inhibitors, and drugs.
 18. A method of using a protein to prepare and purify antibodies comprising: a) immunizing an animal with the protein of claim 1 under conditions to elicit an antibody response, b) isolating animal antibodies, c) attaching the protein to a substrate, d) contacting the substrate with isolated antibodies under conditions to allow specific binding to the protein, e) dissociating the antibodies from the protein, thereby obtaining purified antibodies.
 19. An antibody which specifically binds a protein of claim 1.
 20. A method for using an antibody to detect protein expression in a sample, the method comprising: a) combining the antibody of claim 19 with a sample under conditions to form antibody:protein complexes, and b) detecting complex formation with standards, wherein detection indicates expression of the protein in the sample.